Mechanism to dynamically allocate physical storage device resources in virtualized environments

ABSTRACT

A storage device is disclosed. The storage device may include storage for data and at least one Input/Output (I/O) queue for requests from at least one virtual machine (VM) on a host device. The storage device may support an I/O queue creation command to request the allocation of an I/O queue for a VM. The I/O queue creation command may include an LBA range attribute for a range of Logical Block Addresses (LBAs) to be associated with the I/O queue. The storage device may map the range of LBAs to a range of Physical Block Addresses (PBAs) in the storage.

RELATED APPLICATION DATA

This application is a divisional of U.S. patent application Ser. No. 15/959,108, filed Apr. 20, 2018, now allowed, which claims the benefit of U.S. Provisional Patent Application Ser. No. 62/642,596, filed Mar. 13, 2018, and which is a continuation-in-part of U.S. patent application Ser. No. 14/862,145, filed Sep. 22, 2015, now U.S. Pat. No. 10,838,852, issued Nov. 17, 2020, which claims the priority benefit under 35 U.S.C. § 119(e) of U.S. Provisional Patent Application Ser. No. 62/149,509, filed Apr. 17, 2015. U.S. patent application Ser. No. 15/959,108, filed Apr. 20, 2018, U.S. patent application Ser. No. 14/862,145, filed Sep. 22, 2015, and U.S. Provisional Patent Application Ser. No. 62/642,596, filed Mar. 13, 2018 are incorporated by reference herein for all purposes.

This application is related to U.S. patent application Ser. No. 17/024,649, filed Sep. 17, 2020, now pending, which is a continuation of U.S. patent application Ser. No. 14/862,145, filed Sep. 22, 2015, now U.S. Pat. No. 10,838,852, issued Nov. 17, 2020, which claims the priority benefit under 35 U.S.C. § 119(e) of U.S. Provisional Patent Application Ser. No. 62/149,509, filed Apr. 17, 2015.

This application is related to U.S. patent application Ser. No. ______, filed ______, which is a continuation of U.S. patent application Ser. No. 15/959,108, filed Apr. 20, 2018, now allowed, which claims the benefit of U.S. Provisional Patent Application Ser. No. 62/642,596, filed Mar. 13, 2018, and which is a continuation-in-part of U.S. patent application Ser. No. 14/862,145, filed Sep. 22, 2015, now U.S. Pat. No. 10,838,852, issued Nov. 17, 2020, which claims the priority benefit under 35 U.S.C. § 119(e) of U.S. Provisional Patent Application Ser. No. 62/149,509, filed Apr. 17, 2015. U.S. patent application Ser. No. 15/959,108, filed Apr. 20, 2018, U.S. patent application Ser. No. 14/862,145, filed Sep. 22, 2015, and U.S. Provisional Patent Application Ser. No. 62/642,596, filed Mar. 13, 2018 are incorporated by reference herein for all purposes.

FIELD

The inventive concepts relate generally to storage devices, and more particularly supporting access to storage devices by virtual machines that may be isolated from each other.

BACKGROUND

Single Root Input/Output Virtualization (SR-IOV) is a specification-backed interface mechanism that allows a single physical Peripheral Component Interconnect Express (PCIe) device to appear as multiple, separate physical PCIe devices. SR-IOV helps share and isolate PCIe resources for performance and manageability reasons while simultaneously promoting interoperability.

SR-IOV has existed for a decade for network adapters. Very recently, SR-IOV has begun to include storage. Central Processing Unit (CPU) processing already provides resource isolation that has helped rapid virtualization adoption through a hypervisor as a primary device and virtual machines (VMs) as secondary devices. With SR-IOV, network and storage devices expose a physical function device (PF) and virtual function devices (VFs). These jointly provide device isolation sufficient to transform a physical server into multiple virtual servers so entire applications may each run in their own isolated space.

Even though computing processing, network, and storage form the three virtualization pillars, storage devices and storage device vendors have lagged in conforming to SR-IOV. This fact may be because, unlike with networking, storage defines a data address space referenced by a range of logical block addresses (LBAs). This LBA range may only be subdivided into a finite number of units. Moreover, storage devices require physical hardware gates to support additional VFs, since a VF is hardware functionality exposed directly to a VM's Peripheral Component Interconnect (PCI) space. Adding SR-IOV to a storage/network device increases its gate count and chip size and consumes more power.

SR-IOV solves the hardware isolation issue while providing bare-metal performance since, unlike para-virtualized devices, I/O does not have to go through the hypervisor. Non-Volatile Memory Express (NVMe) storage devices are the latest to adopt SR-IOV. But for storage there may be other mechanisms to provide isolated access for multiple VMs.

A need remains for a way to offer functionality like that offered by SR-IOV, but without the hardware requirements and limitations imposed by SR-IOV.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a device supporting isolated Virtual Machine (VM) access to a storage device, according to an embodiment of the inventive concept.

FIG. 2 shows additional details of the device of FIG. 1.

FIG. 3 shows a path of communication between the VMs of FIG. 1 and the storage device of FIG. 1, where the storage device of FIG. 1 exposes only one physical function.

FIG. 4 shows a path of communication between the VMs of FIG. 1 and the storage device of FIG. 1, where the storage device of FIG. 1 exposes multiple physical functions.

FIG. 5 shows details of the storage device of FIG. 1.

FIG. 6 shows an extended I/O queue creation command for the storage device of FIG. 1.

FIG. 7 shows the physical storage of the storage device of FIG. 1 divided into multiple namespaces.

FIG. 8 shows memory mapping of doorbells in the storage device of FIG. 1 to support VM isolation.

FIG. 9 shows an extended virtual I/O queue creation command for the Field Programmable Gate Array (FPGA) of FIG. 3.

FIG. 10 shows the FPGA of FIG. 3 supporting virtual I/O queues that map to I/O queues in the storage device of FIG. 1.

FIG. 11 shows a flowchart of an example procedure for the storage device of FIG. 1 to allocate an I/O queue for a VM, according to an embodiment of the inventive concept.

FIG. 12 shows a flowchart of an example procedure for the FPGA of FIG. 3 to allocate a virtual I/O queue for a VM, according to an embodiment of the inventive concept.

FIG. 13 shows a flowchart of an example procedure for the hypervisor of FIG. 3 to process administrative requests from the VMs of FIG. 3, according to an embodiment of the inventive concept.

FIG. 14 shows a flowchart of an example procedure for the storage device of FIG. 1 or the FPGA of FIG. 3 to map memory addresses of doorbells to different operating system pages to support VM isolation, according to an embodiment of the inventive concept.

DETAILED DESCRIPTION

Reference will now be made in detail to embodiments of the inventive concept, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous specific details are set forth to enable a thorough understanding of the inventive concept. It should be understood, however, that persons having ordinary skill in the art may practice the inventive concept without these specific details. In other instances, well-known methods, procedures, components, circuits, and networks have not been described in detail so as not to unnecessarily obscure aspects of the embodiments.

It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first module could be termed a second module, and, similarly, a second module could be termed a first module, without departing from the scope of the inventive concept.

The terminology used in the description of the inventive concept herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the inventive concept. As used in the description of the inventive concept and the appended claims, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The components and features of the drawings are not necessarily drawn to scale.

Presently, early adopters of Single-Root Input/Output Virtualization (SR-IOV) for Non-Volatile Memory Express (NVMe) storage devices have taken the route of providing a finite number of virtual functions (VFs) per physical function (PF) and associated namespaces. The allocation of the logical address space range depends on the namespace allocations, while the bare-metal direct access for VMs depends on the number of supported VFs. This implementation creates problems for a storage device that supports large capacities, since it must support additional hardware at the cost of power and die size to meet the performance demands of directly supporting multiple VMs. At a minimum, supporting SR-IOV functionality requires devices to provide each VF a separate Peripheral Component Interconnect (PCI) config space, I/O bars, I/O queues for submission and completion queues, Message Signaled Interrupts (MSI-X), and doorbell addresses for supported queues.

The deficiency of a fixed allocation approach available in the first SR-IOV storage devices prompted the NVMe committee to develop another specification that removed implementation limitations, such as changing fixed resource allocations to dynamic resource allocations and management. Numerous currently open and active proposed changes are trying to solve resource allocation issues. However, they are bound by physical device limits, such as the number of supported VFs, which adds to the specification complexity that inevitably appears.

Although there are a handful of leading hypervisors in the market such as VMWare, Microsoft, Citrix XenServer, KVM, Qemu, and Oracle VM, only a few from this set are currently actively adapting to market changes. In doing so, they have captured a majority market share. Due to the very nature of their programming environment, each hypervisor environment may be considered a custom implementation. Supporting such implementational variation may therefore be important.

Embodiments of the inventive concept define a simpler mechanism to program SR-IOV storage devices to provide VM isolation and performance. Embodiments of the inventive concept follow a simpler approach by relying on the hypervisor to map resources. Embodiments of the inventive concept may include policies for:

1) Extending performance to a VM by providing direct application access to a VM's I/O queues (a submission and completion queue pair).

2) Directly mapping an I/O queue's (submission and completion queue) capabilities to a VM.

3) Encapsulating controller specifics in the hypervisor for management paths.

4) Dynamically isolating physical resources per individual VM requirements.

Embodiments of the inventive concept may effectively provide SR-IOV-type functionality by adding new NVMe storage device support as a feature.

Embodiments of the inventive concept define the following new features:

1) An advanced management mechanism to remap an NVMe I/O submission queue and I/O completion queue (together known as I/O queue pair) directly to a VM for performance purposes.

2) A mechanism in the hypervisor to implement a virtualized controller that maps hardware I/O queues dynamically to a VM as a set of the virtualized controller.

3) A mechanism to map a logical address space as a logical unit or namespace.

4) A mechanism to map additional I/O queues not available in the storage device using additional methods.

5) A method to provide VM-specific isolation for a shared resource.

Embodiments of the inventive concept provide the following advantages over conventional technology:

1) Providing SR-IOV-like functionality without expensive SR-IOV hardware requirements.

2) Providing VMs additional I/O resource isolation and performance benefits that surpass device specifications.

3) Providing hardware ability that may be fully virtualized to avoid operating system (O/S) in-box driver changes.

4) Creating a Quality of Service (QoS) channel between storage devices and a hypervisor.

5) Simplifying hardware requirements for storage virtualization.

Embodiments of the inventive concept provide a mechanism to remap storage device I/O resources directly to VMs. Embodiments of the inventive concept use existing remapping resources for memory and interrupts, including an Input-Output Memory Management Unit (IOMMU) for x86 architectures and additional hardware resources to map to a large number of VMs that the storage device would not normally support.

To achieve these benefits, a storage device, which may include a Solid State Drive (SSD), should:

1) Support extended I/O queue creation properties.

2) Support simple logical address remapping at the I/O queue level.

3) Support doorbells at an O/S page boundary for VM security.

4) Advertise these extended attributes through one or more fields per NVMe specification standards.

5) Optionally provide queue priority class arbitration mechanisms beyond the default to apply QoS.

Mapping an NVMe I/O Queue Pair Directly to a VM

An I/O submission queue and I/O completion queue may together be known as an I/O queue pair since they function together and, when resources are limited, will apply a 1:1 mapping automatically. Most (if not all) in-box NVMe drivers create a 1:1 mapping of these queues. Embodiments of the inventive concept target this usage since these are the device driver implementations that run in a VM.

A storage device that supports embodiments of the inventive concept may provide extended I/O queue creation commands for submission and completion queues. The command may be applied through a separate operational code that may be marked as optional or may be defined as a vendor-specific command. The extended command may support the basic NVMe queue creation command specifics the NVMe specification defines, such as queue size, queue identifier, queue priority class, whether the queue buffer is physically contiguous, interrupt vector, and interrupt-enabled fields. But in addition to these, the extended command may also support a logical block addressing offset within the device's address space. The mechanisms work as follows:

1) When a VM is created, the hypervisor exposes a virtualized NVMe storage device in the VM's PCI space. The device may initially be fully virtualized to take guest exits on access to its PCI configuration and I/O memory mapped regions. For interrupts, the hypervisor may set up specific MSI-X interrupts to directly interrupt the guest VM as appropriate per the hypervisor's implementation.

2) In the I/O memory mapped space, the hypervisor may trap any NVMe configuration space changes and virtualize the access request as appropriate. The hypervisor may expose doorbells as part of the I/O memory at an O/S page level granularity so that the hypervisor may map each doorbell at a VM level. The storage device may also support this feature.

3) When a VM creates an I/O submission queue, the hypervisor traps the request and maps it to a storage device's physical I/O submission queue using the “Extended Create I/O Submission Queue” command. The hypervisor and the storage device may use the values provided by the VM and change/add the following:

a) Map the queue memory to its physical page(s) so the storage device has direct access to it. This mechanism may be provided by the IOMMU on Intel x86 based architectures.

b) Add a queue priority, if supported, that prioritizes this queue with respect to other VMs.

c) Bind a previously created I/O completion queue ID in the VM's space to this submission queue with respect to the storage device I/O queue.

d) Add logical block address start and end values that map a part of the physical address space to the VM.

e) Apply appropriate QoS requirements for either minimum or maximum I/O operations, or minimum or maximum bytes transferred per second granularity.

f) Additionally, the storage device may provide global namespace access privileges if the VM requires it. These global namespace access privileges may be specified in an array that lists the namespace IDs.

g) Additionally, if the storage device provides such support, the storage device may provide namespace access-type privileges such as read only, read write, or exclusive access.

4) The hypervisor may also trap I/O completion queue creation requests and instead:

a) Map the queue memory to its physical page(s) so that the storage device has direct access to it. On Intel x86 based architectures, the IOMMU may provide this mechanism.

b) Map the interrupt vector provided between an actual storage device vector and the VM through system architecture mechanisms already in place for virtualization, such as the IOMMU.

5) Depending on managed VM complexity and hypervisor implementation, embodiments of the inventive concept may map multiple I/O creation requests to a single physical queue. If implemented, the FPGA may handle the I/O completion queues and may send interrupts back to the VM guest. This mechanism may address the dynamic queue allocation mechanism.

6) In another usage of the dynamic queue allocation mechanism, the hypervisor may expose only the required I/O queues based on the VM's Service Level Agreements (SLAs).
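As an illustration of steps 3) a) through f) above, the following is a minimal Python sketch of how a hypervisor might build an extended submission queue creation command from a trapped VM request. The class name, field names, dictionary keys, and the guest_to_host translation callback are all hypothetical: they are not taken from the NVMe specification or from any particular hypervisor, and the actual command layout would be device-specific.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ExtendedCreateIOSQ:
    """Hypothetical extended command; field names are illustrative only."""
    # Values taken from the VM's ordinary queue creation request.
    queue_id: int
    queue_size: int
    completion_queue_id: int
    # Values the hypervisor changes or adds before forwarding to the device.
    queue_memory_host_addr: int           # a) host physical page backing the queue
    lba_start: int                        # d) start of the LBA range for this VM
    lba_end: int                          # d) end of the LBA range (inclusive here)
    priority_class: Optional[int] = None  # b) only if the device supports it
    max_iops: Optional[int] = None        # e) optional QoS limit
    max_bytes_per_sec: Optional[int] = None
    shared_namespace_ids: List[int] = field(default_factory=list)  # f)


def trap_create_sq(vm_request: dict, policy: dict,
                   guest_to_host: Callable[[int], int]) -> ExtendedCreateIOSQ:
    """Build the extended command from a trapped VM request and a per-VM policy."""
    if policy["lba_end"] < policy["lba_start"]:
        raise ValueError("LBA range is empty")
    return ExtendedCreateIOSQ(
        queue_id=vm_request["queue_id"],
        queue_size=vm_request["queue_size"],
        completion_queue_id=vm_request["completion_queue_id"],
        queue_memory_host_addr=guest_to_host(vm_request["guest_addr"]),
        lba_start=policy["lba_start"],
        lba_end=policy["lba_end"],
        priority_class=policy.get("priority_class"),
        max_iops=policy.get("max_iops"),
        max_bytes_per_sec=policy.get("max_bytes_per_sec"),
        shared_namespace_ids=list(policy.get("shared_namespace_ids", [])),
    )
```

In this sketch the VM supplies only the ordinary queue parameters, while the LBA range, priority, and QoS values come from a hypervisor-side policy, so the VM never sees or controls the range it is confined to.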

Once set up, the hypervisor-aided mechanisms that may map hardware I/O queues dynamically to a VM may provide the necessary isolation and performance benefits similar to those SR-IOV provides, but with lower manufacturing, testing, and debugging costs. The hypervisor may also require some changes, but these changes are self-contained and restricted to the available hypervisors, reducing the overall impact. If the storage device is installed in a system that does not support virtualization, the storage device should perform as a regular NVMe storage device.

Managing Address Space Isolation

NVMe may share a single, unique logical address space by using Namespaces which work like Small Computer Systems Interface (SCSI) Logical Unit Numbers (LUNs). Given a Namespace ID, the logical address blocks may be offset with an address where the namespace starts in the full logical address map.

In a conventional storage device, the physical address space may be subdivided into logical units by creating namespaces with their co-ordinates. In a conventional storage device, this subdivision may require additional Namespace support for creating multiple namespaces. Embodiments of the inventive concept bypass this requirement by attaching the created I/O queue directly to a logical unit space. For example, extended attributes may be defined as part of the I/O queue's logically addressable space, so the default namespace maps to it. This modification aligns each I/O queue, giving a VM direct access to the appropriate space. Typically VMs either request access to only a private namespace or to a shared namespace. The extended attributes in I/O queue creation may specify the default namespace start and end LBAs with respect to the global physical address space. This mechanism addresses hardware Namespace management requirements. Any incoming I/O request must go through an I/O queue. Since the I/O queue already holds the default namespace mapping offset, it may translate the LBA addresses directly using this programmed offset.

If access to multiple namespaces is needed, the extended I/O queue creation may have an additional definition to support global namespace access. For example, I/O queue 23: Global Namespace accessed: 3.
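The per-queue translation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and parameters are hypothetical, the end LBA is treated as inclusive, and a request that would cross the end of the queue's range is rejected rather than truncated.

```python
def translate_lba(queue_lba_start: int, queue_lba_end: int,
                  vm_lba: int, num_blocks: int = 1) -> int:
    """Translate a VM-relative LBA to a device LBA using the queue's programmed range.

    queue_lba_start and queue_lba_end are the values programmed at queue
    creation; queue_lba_end is assumed to be inclusive.
    """
    span = queue_lba_end - queue_lba_start + 1
    if vm_lba < 0 or num_blocks < 1 or vm_lba + num_blocks > span:
        raise ValueError("request falls outside the queue's LBA range")
    # The VM addresses its space from zero; the queue adds its start offset.
    return queue_lba_start + vm_lba
```

For a queue programmed with the range 1000 to 1999, the VM's LBA 0 lands on device LBA 1000, and any request reaching past the VM's LBA 999 is refused, which is what confines the VM to its own part of the address space.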

Mechanism to Map Additional I/O Queues not Available in the Storage Device

This mechanism involves the use of additional Field Programmable Gate Array (FPGA) logic that provides the I/O queues in a separate space. Using this mechanism, embodiments of the inventive concept may support significantly more I/O queues that may be mapped directly to VMs. The FPGA may support the NVMe specification or a subset of the specification through which this mechanism is arbitrated.

Full specification support: In this mechanism, the FPGA may fully mimic the NVMe specification and provide full specification-level functionality. The FPGA may provide the memory map for associated I/O queue doorbells at an O/S page granularity, the support for additional I/O queues not available in the device, full MSI-X interrupt support for I/O queues supported, and mapping logical address space per I/O queue. In this mechanism, the storage device need not support the extended I/O queue functionality at all, which may be fully implemented in the additional programmable hardware.
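To make the idea of doorbells at an O/S page granularity concrete, the following Python sketch places each queue's doorbells in its own page so that a hypervisor could map exactly one page into each VM. The page size, the base offset, and the function names are assumptions chosen for illustration; an actual device or FPGA would advertise its own layout.

```python
PAGE_SIZE = 4096  # assumed O/S page size


def doorbell_offset(queue_id: int, page_size: int = PAGE_SIZE,
                    base: int = 0x1000) -> int:
    """Return the offset of a queue's doorbell page within the I/O memory region.

    base is an illustrative starting offset and must itself be page aligned,
    otherwise two queues could end up sharing a page.
    """
    if queue_id < 0:
        raise ValueError("queue_id must be non-negative")
    if base % page_size != 0:
        raise ValueError("base must be aligned to the page size")
    return base + queue_id * page_size


def same_page(offset_a: int, offset_b: int, page_size: int = PAGE_SIZE) -> bool:
    """True if two offsets fall in the same O/S page."""
    return offset_a // page_size == offset_b // page_size
```

Because no two queues share a page in this layout, mapping one queue's doorbell page into a VM does not expose any other VM's doorbells.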

Partial specification support: If the number of VMs may be predicted to be no greater than the number of I/O queues supported by the storage device, part or all of the functionality may be implemented within the storage device. The FPGA may then function as a “pass-through” device for I/O requests by the VM, or using a one-to-one mapping of virtual I/O queues to storage device I/O queues. The FPGA may still be used to reduce the amount of hardware used to implement functionality in the storage device.

In either case, the FPGA logic may provide the necessary isolation granularity to each VM. To provide a large number of I/O queues, the FPGA may map many of its exposed I/O queues to a single storage device I/O queue.

The FPGA may provide the logical address space mapping constructs and the associated namespace-like isolation. The FPGA may also provide the necessary MSI-X interrupt mapping capability for this device to be fully functional.
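The many-to-one mapping of FPGA-exposed I/O queues onto storage device I/O queues might be modeled as follows. This Python sketch is illustrative only: the class and method names are hypothetical, and the least-loaded placement policy is an assumption, since the description does not say how a device queue is chosen.

```python
from typing import Dict, Iterable, Tuple


class VirtualQueueMap:
    """Maps many virtual I/O queues onto fewer storage device I/O queues."""

    def __init__(self, device_queue_ids: Iterable[int]):
        self._device_queues = list(device_queue_ids)
        if not self._device_queues:
            raise ValueError("at least one device queue is required")
        # virtual queue id -> (device queue id, lba_start, lba_end inclusive)
        self._map: Dict[int, Tuple[int, int, int]] = {}

    def create(self, virtual_qid: int, lba_start: int, lba_end: int) -> int:
        """Create a virtual queue with its own LBA range; return its device queue."""
        load = {q: 0 for q in self._device_queues}
        for device_qid, _, _ in self._map.values():
            load[device_qid] += 1
        # Assumed policy: pick the device queue with the fewest virtual queues.
        target = min(self._device_queues, key=lambda q: load[q])
        self._map[virtual_qid] = (target, lba_start, lba_end)
        return target

    def route(self, virtual_qid: int, vm_lba: int) -> Tuple[int, int]:
        """Return (device queue id, device LBA) for a request on a virtual queue."""
        device_qid, start, end = self._map[virtual_qid]
        if not 0 <= vm_lba <= end - start:
            raise ValueError("LBA outside the virtual queue's range")
        return device_qid, start + vm_lba
```

Each virtual queue keeps its own LBA range even when it shares a device queue with others, so the namespace-like isolation is preserved while the number of queues visible to VMs exceeds what the device itself offers.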

Quality of Service

When multiple VMs perform I/O operations that access a single storage device, they inhibit performance due to the blender effect. The storage device does not have access to specific VMs when a device is shared without isolation at the VM level. With SR-IOV, the isolation is provided at the PF mapping level, but the benefits are not advertised or are unknown. Embodiments of the inventive concept may support binding an I/O queue resource to a VM that provides natural isolation, not only at the resource level, but also to the VM I/O requests in flow. The storage device has full VM knowledge through the configured I/O queue identification. If the storage device supports additional priority class arbitration mechanisms (as defined in the NVMe specification or part of a vendor-specific command), the hypervisor may apply them to the I/O queue at creation time. The hypervisor may choose to apply this priority class support based on a VM's requirements.

Based on the storage device functionality provided, embodiments of the inventive concept may also expose an additional field in the extended I/O queue creation command that provides a performance limit or minimum service required to apply to the storage device. The performance limit or minimum service required may be quantified in read and write I/O count or bytes transferred. Such a performance limit or minimum service required may also be a settable option based on device support and by the hypervisor to apply to the VM.
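One way a per-queue performance limit quantified in I/O count or bytes transferred could be enforced is sketched below in Python. This is an assumption-laden illustration, not the disclosed mechanism: the class name is hypothetical, the limit is applied over fixed one-second windows, timestamps are passed in explicitly, and only a maximum limit is modeled (a minimum service guarantee would need a scheduler rather than a simple admission check).

```python
from typing import Optional


class QueueQoS:
    """Fixed-window limiter for one I/O queue (illustrative only)."""

    def __init__(self, max_ios_per_sec: Optional[int] = None,
                 max_bytes_per_sec: Optional[int] = None):
        self.max_ios = max_ios_per_sec
        self.max_bytes = max_bytes_per_sec
        self._window: Optional[int] = None  # index of the current one-second window
        self._ios = 0
        self._bytes = 0

    def admit(self, now_sec: float, nbytes: int) -> bool:
        """Return True if a request of nbytes may proceed at time now_sec."""
        window = int(now_sec)
        if window != self._window:
            # A new one-second window starts: reset the counters.
            self._window, self._ios, self._bytes = window, 0, 0
        if self.max_ios is not None and self._ios + 1 > self.max_ios:
            return False
        if self.max_bytes is not None and self._bytes + nbytes > self.max_bytes:
            return False
        self._ios += 1
        self._bytes += nbytes
        return True
```

A rejected request is not counted against the window, so a queue that is over its limit simply waits for the next window rather than being penalized further.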

Typical usages of embodiments of the inventive concept may apply directly to storage virtualization in the enterprise segment which heavily utilizes virtual machines (VMs).

FIG. 1 shows a device supporting isolated Virtual Machine (VM) access to a storage device, according to an embodiment of the inventive concept. In FIG. 1, device 105, which may also be called a host computer or host device, is shown. Device 105 may include processor 110. Processor 110 may be any variety of processor: for example, an Intel Xeon, Celeron, Itanium, or Atom processor, an AMD Opteron processor, an ARM processor, etc. While FIG. 1 shows a single processor 110 in device 105, device 105 may include any number of processors, each of which may be single core or multi-core processors, and may be mixed in any desired combination. Processor 110 may run device driver 115, which may support access to storage device 120: different device drivers may support access to other components of device 105.

Device 105 may also include memory controller 125, which may be used to manage access to main memory 130. Memory 130 may be any variety of memory, such as flash memory, Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), Persistent Random Access Memory, Ferroelectric Random Access Memory (FRAM), or Non-Volatile Random Access Memory (NVRAM), such as Magnetoresistive Random Access Memory (MRAM), etc. Memory 130 may also be any desired combination of different memory types.

Although FIG. 1 depicts device 105 as a server (which could be either a standalone or a rack server), embodiments of the inventive concept may include device 105 of any desired type without limitation. For example, device 105 could be replaced with a desktop or a laptop computer or any other device that may benefit from embodiments of the inventive concept. Device 105 may also include specialized portable computing devices, tablet computers, smartphones, and other computing devices.

FIG. 2 shows additional details of device 105 of FIG. 1. In FIG. 2, typically, device 105 includes one or more processors 110, which may include memory controllers 125 and clocks 205, which may be used to coordinate the operations of the components of device 105. Processors 110 may also be coupled to memories 130, which may include random access memory (RAM), read-only memory (ROM), or other state preserving media, as examples. Processors 110 may also be coupled to storage devices 120, and to network connector 210, which may be, for example, an Ethernet connector or a wireless connector. Processors 110 may also be connected to buses 215, to which may be attached user interfaces 220 and Input/Output interface ports that may be managed using Input/Output engines 225, among other components.

FIG. 3 shows a path of communication between the VMs of FIG. 1 and storage device 120 of FIG. 1, where storage device 120 of FIG. 1 exposes only one physical function. In FIG. 3, three VMs 305-1, 305-2, and 305-3 are shown, which may be instantiated on host device 105 of FIG. 1. While FIG. 3 shows three VMs 305-1, 305-2, and 305-3, embodiments of the inventive concept may include host device 105 of FIG. 1 supporting any number of VMs.

VMs 305-1, 305-2, and 305-3 may communicate with hypervisor 310. Hypervisor 310 may create, manage, and run VMs 305-1, 305-2, and 305-3. Hypervisor 310 is usually implemented as software running on processor 110 of host device 105 of FIG. 1.

To achieve interaction with hardware devices, particularly hardware implementing Single Root Input/Output Virtualization (SR-IOV), such hardware devices may expose various physical functions. For example, FIG. 3 shows Field Programmable Gate Array (FPGA) 315 exposing one physical function (PF) 320.

To enable VMs 305-1, 305-2, and 305-3 to interact with the hardware devices, various virtual functions (VFs) may be exposed as well. Rather than having to interact directly with the hardware, VMs 305-1, 305-2, and 305-3 may interact with VFs 325-1, 325-2, and 325-3. VFs 325-1, 325-2, and 325-3 offer virtualized versions of PF 320, enabling VMs that use different operating systems (O/Ss) to interact efficiently (for example, using native O/S drivers) with the underlying hardware device. While FIG. 3 shows three VFs for PF 320, embodiments of the inventive concept may include any number of VFs per PF; and if FPGA 315 (or storage device 120) exposes more than one PF, there may be varying numbers of VFs per PF. Each VF exposed may require some hardware support from the underlying device. For example, each VF exposed for storage device 120 may require additional endtraps and doorbells, which require additional hardware implementation.

As noted above, PF 320 may be exposed by FPGA 315, rather than storage device 120, which is the underlying hardware that implements PF 320. FPGA 315 may interrogate storage device 120 to determine which PF(s) are exposed by storage device 120, and then expose comparable PF(s) itself, which may map directly across FPGA 315 to the corresponding PF of storage device 120.

Alternatively, PF 320 may be exposed directly by storage device 120, whereas VFs 325-1, 325-2, and 325-3 may be exposed by FPGA 315. Embodiments of the inventive concept that operate using such implementations may avoid FPGA 315 having to implement in hardware PF functionality already implemented by storage device 120, while complementing that hardware implementation with VFs that may be greater in number than storage device 120 could itself offer.

Regardless of whether storage device 120 or FPGA 315 implements PF 320, FPGA 315 may be interposed between processor 110 (and hypervisor 310) on the one hand and storage device 120 on the other hand. Any communications between hypervisor 310 (and therefore VMs 305-1, 305-2, and 305-3) and storage device 120 would pass through FPGA 315, allowing FPGA 315 to augment the functionality offered by storage device 120 (or potentially to offer SR-IOV-type support for storage device 120 when storage device 120 does not itself offer SR-IOV-type implementation).

In terms of implementation, FPGA 315 may be hardware within storage device 120 (that is, storage device 120 may include FPGA 315 internally to its structure), or FPGA 315 may be additional hardware external to storage device 120 but still along the communication path between processor 110 and storage device 120. For example, FPGA 315 may be implemented as a circuit board installed within host device 105 of FIG. 1 that receives data from a Peripheral Component Interconnect Express (PCIe) bus, with a connection from FPGA 315 to storage device 120. Regardless of how FPGA 315 is implemented, FPGA 315 should be somewhere in between processor 110 and storage device 120, to capture information sent by hypervisor 310 and perform the functionality of FPGA 315.

Hypervisor 310 may trap administrative requests from VMs 305-1, 305-2, and 305-3 to manage their processing. For example, hypervisor 310 may trap information sent to the Peripheral Component Interconnect (PCI) configuration space or requests destined for the administrative queue of storage device 120 and either process them locally, redirect the requests to FPGA 315, or generate new requests that are similar to the original requests (although different in specific ways that depend on the specific requests).

FPGA 315 offers an advantage over implementing SR-IOV functionality within storage device 120 in that FPGA 315 may be programmed on-site as part of the installation process: storage device 120 is usually programmed during manufacture. For example, FPGA 315 might be capable of supporting, say, 100 VFs for a storage device, but at the time of installation the customer might want to expose only, say, 50 VFs (since host device 105 of FIG. 1 might not be capable of supporting that many VMs). FPGA 315 may then be programmed, using conventional techniques, to expose only 50 VFs, leaving the remaining gates unused (and therefore not consuming power). For example, FPGA 315 may be programmed at the time of manufacture with a shell that offers an interface using Non-Volatile Memory Express (NVMe) commands. The customer, after installing FPGA 315 into host device 105 of FIG. 1, may use these NVMe commands to customize FPGA 315 as desired.

FPGA 315 may also be programmed as desired in terms of what parameters may be supported. For example, as discussed below with reference to FIG. 9, there are numerous different variations on QoS requirements. But FPGA 315 might be programmed to consider only bandwidth, and to ignore parameters relating to numbers of I/O requests or numbers of bytes processed per second. The same possibility is true with respect to other parameters of the extended I/O queue creation command.

FIG. 4 shows a path of communication between the VMs of FIG. 1 and storage device 120 of FIG. 1, where storage device 120 of FIG. 1 exposes multiple physical functions. FIG. 4 is identical to FIG. 3, except that instead of exposing one PF 320, in FIG. 4 three PFs 320, 405, and 410 are exposed, with one VF 325-1, 325-2, and 325-3 exposed for each PF. While FIG. 4 shows FPGA 315 exposing three PFs 320, 405, and 410, embodiments of the inventive concept may support exposing any number of PFs, and any number of VFs 325-1, 325-2, and 325-3 may be exposed for each PF. Multiple PFs might be exposed for any number of reasons: for example, the amount of storage offered by storage device 120 might be too large for a single PF to adequately support. Each PF typically requires some particular hardware support. Therefore, the more PFs are to be exposed, the more hardware is typically required in the underlying device, increasing its die size and power consumption, and potentially decreasing its performance as viewed by a single VM. Each PF may offer different functionality, or multiple PFs may offer the same functionality (enabling a single device to offer support to more than one VM at a time).

In FIGS. 3-4, all of the functionality described herein to offer SR-IOV-type support might be implemented in FPGA 315, or some (or all) might be implemented in storage device 120. For example, in some embodiments of the inventive concept, storage device 120 might be a conventional storage device, offering no built-in support for any SR-IOV-type functionality, with FPGA 315 responsible for all of the SR-IOV-type functionality. In other embodiments of the inventive concept, storage device 120 might offer support for extended I/O queue creation commands (discussed below with reference to FIGS. 5-6), and if storage device 120 includes enough I/O queues to support the expected number of VMs, FPGA 315 might only offer VFs 325-1, 325-2, and 325-3 and manage I/O requests from the VMs as a “pass-through” device. In yet other embodiments of the inventive concept, storage device 120 might offer support for extended I/O queue creation commands, but because the number of VMs is expected to exceed the number of I/O queues of storage device 120, FPGA 315 might still manage I/O queue creation internally: storage device 120 might then be treated as a conventional storage device despite supporting the extended I/O queue creation command.

FIG. 5 shows details of storage device 120 of FIG. 1. In FIG. 5, storage device 120 may include host interface 505, which manages communications with host device 105 of FIG. 1. Storage device 120 may also include storage 510, which stores the actual data being accessed by VMs 305-1, 305-2, and 305-3 of FIG. 3. Storage 510 may take any desired form: for example, if storage device 120 is a Solid State Drive (SSD), then storage 510 may take the form of flash memory, whereas if storage device 120 is a hard disk drive, then storage 510 may take the form of disks. Storage device 120 may also include doorbell distribution logic 515, which may distribute doorbells across the internal memory of storage device 120 so that each doorbell is located in a different page of the internal memory (where the size of the page may be determined from the page size of memory 130 of FIG. 1). By placing doorbells in different pages, different VMs access different pages to access their doorbell(s), avoiding the possibility that multiple VMs might need to access the same memory page to access their doorbells (which would go against the objective of VM isolation). And while FIG. 5 shows doorbell distribution logic 515 as part of storage device 120, embodiments of the inventive concept may place doorbell distribution logic 515 in FPGA 315 of FIG. 3 as well. Doorbell distribution is discussed further with reference to FIG. 8 below.

FIG. 5 does not show other conventional hardware and/or circuitry that might be included in storage device 120, such as a Flash Translation Layer (FTL) and read/write circuitry for an SSD or read/write heads for a hard disk drive. Nor does FIG. 5 show additional optional components that might be included in storage device 120, such as a cache. Embodiments of the inventive concept extend to include all such conventional components.

Storage device 120 may use any desired interface for communicating with host device 105 of FIG. 1. For example, storage device 120 may use an NVMe interface, or storage device 120 may use a Serial AT Attachment (SATA) interface. Similarly, storage device 120 may use any desired connection mechanism for connecting to host device 105, including, for example, PCI or PCIe (using 4, 8, or any other number of PCI lanes), M.2, and SATA, among other possibilities. Embodiments of the inventive concept are intended to include all variations on connection and interface.

Storage device 120 may also include I/O queues (also called I/O queue pairs) that may be used for request submission and response return: a submission queue may be used by VM 305-1 of FIG. 3 to submit an I/O request (such as a read or write request), and a completion queue may be used by storage device 120 to return a result to VM 305-1 of FIG. 3. In FIG. 5, storage device 120 is shown including three I/O queues 520-1, 520-2, and 520-3, with submission queues 525-1, 525-2, and 525-3, respectively, and completion queues 530-1, 530-2, and 530-3, respectively. While FIG. 5 suggests that each submission queue 525-1, 525-2, and 525-3 has a corresponding completion queue 530-1, 530-2, and 530-3, embodiments of the inventive concept may support a single completion queue including results from multiple submission queues.

Storage device 120 may include circuitry to support I/O queue creation command 535 that may be offered by storage device 120. Using I/O queue creation command 535, hypervisor 310 of FIG. 3, or FPGA 315 of FIG. 3, may request that an I/O queue be established for use. But while conventional storage devices, such as those supporting the NVMe interface, may offer a standardized form of I/O queue creation command 535, I/O queue creation command 535 as shown represents an extended I/O queue creation command, offering additional attributes not offered or supported by conventional I/O queue creation commands.

FIG. 6 shows extended I/O queue creation command 535 of FIG. 5 for storage device 120 of FIG. 1. In FIG. 6, I/O queue creation command 535 is shown. I/O queue creation command 535 may include parameters that are part of the conventional NVMe specification-defined I/O queue creation command, such as queue size 603 (how much space should be allocated for the queue being established), queue identifier 606 (an identifier for the submission queue being established), completion queue identifier 609 (an identifier for the completion queue), queue priority 612 (the relative priority of the submission queue being established), and physically contiguous flag 615 (indicating whether the submission queue and the completion queue are to be physically contiguous or not). But in addition, I/O queue creation command 535 may include other attributes. These attributes include Logical Block Address (LBA) range attribute 618, Quality of Service (QoS) attribute 621, and shared namespace attribute 624. LBA range attribute 618 may specify a range of LBAs associated with VM 305-1. Using LBA range attribute 618, FPGA 315 of FIG. 3 or storage device 120 of FIG. 1 may allocate a portion of the Physical Block Addresses (PBAs) on storage device 120 of FIG. 1 for use by that VM, and not to be used by other VMs (although an exception to this rule is discussed below). By isolating a range of LBAs and a corresponding range of PBAs, VM isolation may be provided.

FIG. 7 shows an example of how the physical storage of storage device 120 of FIG. 1 may be divided into multiple namespaces. In FIG. 7, 1 TB of storage 510 of FIG. 5 is shown, although embodiments of the inventive concept may support storage devices with any overall capacity. This 1 TB of storage 510 of FIG. 5 is shown divided into three namespaces 705-1, 705-2, and 705-3, but embodiments of the inventive concept may divide storage 510 of FIG. 5 into any number of namespaces.

Each namespace has an associated range of LBAs. Thus, namespace 705-1 includes range of LBAs 710-1, namespace 705-2 includes range of LBAs 710-2, and namespace 705-3 includes range of LBAs 710-3. Corresponding to each range of LBAs 710-1, 710-2, and 710-3, ranges of PBAs 715-1, 715-2, and 715-3 may be established in storage 510 of FIG. 5. Each range of PBAs 715-1, 715-2, and 715-3 should be at least as large as the corresponding range of LBAs, so that every LBA in a given range of LBAs has a corresponding PBA. Note that the LBAs and PBAs do not need to coincide: for example, range of LBAs 710-2 ranges from a starting LBA of 0 to an ending LBA of 134,217,727 (which may be, for example, block addresses, with each block including 4096 bytes of data: the last byte of this ending block would have byte address 549,755,813,887), whereas range of PBAs 715-2 ranges from a starting PBA of 67,108,864 (a starting byte address of 274,877,906,944) to an ending PBA of 201,326,591 (an ending byte address of 824,633,720,831).
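The arithmetic in the FIG. 7 example may be checked with a short sketch. The sketch below assumes a simple fixed-offset (linear) mapping from LBAs to PBAs, which is only one possible mapping and is not required by the description. The function names are hypothetical and chosen for illustration.

```python
BLOCK_SIZE = 4096  # bytes per block, as in the FIG. 7 example


def lba_to_pba(lba, lba_start, lba_end, pba_start):
    """Map an LBA to a PBA using a fixed offset (an assumed, linear mapping)."""
    if not lba_start <= lba <= lba_end:
        raise ValueError("LBA is outside the range associated with this queue")
    return pba_start + (lba - lba_start)


def first_byte_address(block_address):
    """Byte address of the first byte in the given block."""
    return block_address * BLOCK_SIZE


def last_byte_address(block_address):
    """Byte address of the last byte in the given block."""
    return (block_address + 1) * BLOCK_SIZE - 1


# Values taken from range of LBAs 710-2 and range of PBAs 715-2.
LBA_START, LBA_END = 0, 134_217_727
PBA_START = 67_108_864

pba_end = lba_to_pba(LBA_END, LBA_START, LBA_END, PBA_START)
```

Under these assumptions, the computed ending PBA and the byte addresses agree with the values given for range of PBAs 715-2.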

One advantage of mapping LBA ranges to PBAs and associating one (or both) of those ranges with I/O queues is that such a mapping may avoid the blender effect. The blender effect is a consequence of how conventional storage devices process I/O requests. While I/O requests are still within the I/O queues, the I/O requests have some residual context. But once an I/O request has reached the FTL, any such context has been lost: all I/O requests look the same at that point. As a result, storage device 120 of FIG. 1 would not be able to guarantee QoS requirements for a particular VM. But if the PBA may be tied back to a particular I/O queue via the LBA-PBA mapping (which is itself reversible), the QoS requirements of that I/O queue may still be located and satisfied by storage device 120 of FIG. 1.
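One way to picture the reverse lookup from a PBA back to its I/O queue is a sorted table of non-overlapping PBA ranges. The sketch below is illustrative only: the class name, the use of a binary search, the pairing of range of PBAs 715-2 with I/O queue 520-2, and the PBA range given for I/O queue 520-1 are all assumptions that do not come from the description.

```python
from bisect import bisect_right


class PbaRangeTable:
    """Hypothetical table that ties a PBA back to the I/O queue owning it."""

    def __init__(self):
        self._starts = []   # sorted starting PBAs, for binary search
        self._entries = []  # (pba_start, pba_end, queue_id), in the same order

    def add(self, pba_start, pba_end, queue_id):
        # Ranges are assumed not to overlap.
        i = bisect_right(self._starts, pba_start)
        self._starts.insert(i, pba_start)
        self._entries.insert(i, (pba_start, pba_end, queue_id))

    def queue_for(self, pba):
        """Return the owning queue ID, or None if no range contains the PBA."""
        i = bisect_right(self._starts, pba) - 1
        if i >= 0:
            start, end, queue_id = self._entries[i]
            if start <= pba <= end:
                return queue_id
        return None


table = PbaRangeTable()
table.add(67_108_864, 201_326_591, "520-2")  # range of PBAs 715-2 (queue pairing assumed)
table.add(0, 67_108_863, "520-1")            # hypothetical range for another queue
```

With such a table, a request that has already reached the physical layer may still be associated with a queue, and therefore with that queue's QoS requirements.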

It is worth noting that each namespace 705-1, 705-2, and 705-3 has its own namespace identifier (ID). In some embodiments of the inventive concept, these namespace IDs may correspond to the queue identifiers provided as queue identifier 606 of FIG. 6.

Returning to FIG. 6, LBA range attribute 618 may be expressed in a number of ways. For example, LBA range attribute 618 may include LBA start 627 and LBA end 630, which provide the starting and ending addresses for the LBAs in range of LBAs 710-1, 710-2, and 710-3 of FIG. 7. Alternatively, given LBA start 627 and queue size 603, it may be possible to infer LBA end 630, which would permit LBA end 630 to be omitted. The structure of extended I/O queue creation command 535 may be established to support whichever variation on LBA range attribute 618 is desired, including and/or omitting parameters as appropriate.

QoS attribute 621 may represent any desired QoS provisions. For example, the queue being established by I/O queue creation command 535 may be associated with a VM that has a Service Level Agreement (SLA), which attempts to guarantee a particular level of service for the user of the VM. QoS attribute 621 may be expressed using any desired form. Some example forms shown in FIG. 6 include minimum guaranteed bandwidth 633, maximum guaranteed bandwidth 636, minimum number of read requests per second 639, maximum number of read requests per second 642, minimum number of bytes read per second 645, maximum number of bytes read per second 648, minimum number of write requests per second 651, maximum number of write requests per second 654, minimum number of bytes written per second 657, and maximum number of bytes written per second 660. Note that conventional SR-IOV solutions rely on hypervisor 310 of FIG. 3 to manage QoS requirements; embodiments of the inventive concept may shift this management to FPGA 315 of FIG. 3 or storage device 120 of FIG. 1, reducing the load on hypervisor 310 of FIG. 3 and the host CPU load, and thereby improving overall system performance. The structure of extended I/O queue creation command 535 may be established to support whichever variation on QoS attribute 621 is desired, including and/or omitting parameters as appropriate.

Finally, shared namespace attribute 624 may represent a list of namespaces that are intended to share access to the physical storage for the VM requesting the creation of the I/O queue. The concept of the shared namespace attribute represents the exception to the concept of VM isolation: virtual machines may not be isolated if they are sharing access to a common dataset. But as there may be situations where multiple VMs need to share information and it is more efficient for them to share access to a common dataset than to message data between the VMs, shared namespace attribute 624 offers this workaround. Shared namespace attribute 624 may be implemented in any desired manner: one example is for I/O queue creation command 535 to include shared namespace ID array 663, which lists the namespace IDs of the VMs that are to have shared access to the dataset.
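The parameters and attributes of FIG. 6 may be pictured as a single record. The sketch below is a hypothetical in-memory representation only: the class names, field names, and validation rules are assumptions made for illustration, they do not reflect the NVMe wire format, and only a subset of the QoS forms is included.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class QosAttribute:
    """Hypothetical container for QoS attribute 621; every field is optional."""
    min_bandwidth: Optional[int] = None               # minimum guaranteed bandwidth 633
    max_bandwidth: Optional[int] = None               # maximum guaranteed bandwidth 636
    min_read_requests_per_sec: Optional[int] = None   # 639
    max_read_requests_per_sec: Optional[int] = None   # 642
    min_write_requests_per_sec: Optional[int] = None  # 651
    max_write_requests_per_sec: Optional[int] = None  # 654


@dataclass
class ExtendedIoQueueCreationCommand:
    # Conventional parameters
    queue_size: int               # queue size 603
    queue_id: int                 # queue identifier 606
    completion_queue_id: int      # completion queue identifier 609
    queue_priority: int           # queue priority 612
    physically_contiguous: bool   # physically contiguous flag 615
    # Extended attributes
    lba_start: int                # LBA start 627
    lba_end: Optional[int] = None # LBA end 630 (may be omitted if inferable)
    qos: QosAttribute = field(default_factory=QosAttribute)
    shared_namespace_ids: List[int] = field(default_factory=list)  # ID array 663

    def validate(self) -> None:
        """Reject obviously inconsistent commands (assumed rules)."""
        if self.lba_start < 0:
            raise ValueError("LBA start must be non-negative")
        if self.lba_end is not None and self.lba_end < self.lba_start:
            raise ValueError("LBA end must not precede LBA start")
        q = self.qos
        if (q.min_bandwidth is not None and q.max_bandwidth is not None
                and q.min_bandwidth > q.max_bandwidth):
            raise ValueError("minimum bandwidth exceeds maximum bandwidth")


cmd = ExtendedIoQueueCreationCommand(
    queue_size=64, queue_id=2, completion_queue_id=2, queue_priority=1,
    physically_contiguous=True, lba_start=0, lba_end=134_217_727,
    qos=QosAttribute(min_bandwidth=100), shared_namespace_ids=[3],
)
cmd.validate()
```

Making each QoS field optional mirrors the statement that parameters may be included or omitted as appropriate.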

FIG. 8 shows memory mapping of doorbells in storage device 120 of FIG. 1 to support VM isolation. In conventional storage devices that use doorbells to communicate between host device 105 of FIG. 1 and storage device 120 of FIG. 1, there may be multiple doorbells: for example, one for each of I/O queues 520-1, 520-2, and 520-3 of FIG. 5. To support access to these doorbells by VMs 305-1, 305-2, and 305-3 of FIG. 3, storage device 120 of FIG. 1 may request that a portion of the address space of host device 105 of FIG. 1 be allocated to storage device 120. The addresses in the address space of host device 105 of FIG. 1 may then be mapped to the memory addresses of storage device 120. Hypervisor 310 of FIG. 3 may then provide the addresses for the doorbells to VMs 305-1, 305-2, and 305-3 of FIG. 3, enabling VMs 305-1, 305-2, and 305-3 of FIG. 3 to access the doorbells by using the addresses in the address space of host device 105 of FIG. 1.

But in conventional implementations of storage device 120 of FIG. 1, these doorbells may reside in a contiguous section of memory: FIG. 8 illustrates this. In FIG. 8, memory 805 is shown, which represents memory within storage device 120 of FIG. 1 or FPGA 315 of FIG. 3. Memory addresses 810-1, 810-2, and 810-3 for the doorbells of I/O queues 520-1, 520-2, and 520-3 of FIG. 5 are shown occupying a contiguous section of memory, and all lie within a single page of memory 805. While FIG. 8 shows three doorbells occupying memory addresses 810-1, 810-2, and 810-3, embodiments of the inventive concept may include any number of doorbells. In situations where host device 105 of FIG. 1 operates as a single device (without virtualization), this arrangement works fine. But where host device 105 of FIG. 1 supports multiple virtual machines that require isolation, having all the doorbells reside in the same page of memory 805 means that the VMs must share access to the same page of memory, defeating the requirement of VM isolation.

To address this difficulty, storage device 120 of FIG. 1 or FPGA 315 of FIG. 3, depending on which device offers the doorbells, may locate doorbell memory addresses in different pages of memory 130 of FIG. 1, as shown in memory 815. An example of how storage device 120 of FIG. 1 or FPGA 315 of FIG. 3 might locate doorbell memory addresses in different pages is shown in U.S. patent application Ser. No. 14/862,145, filed Sep. 22, 2015, now U.S. Pat. No. 10,838,852, issued Nov. 17, 2020, which was published as U.S. Patent Publication No. 2016/0306580, and which is hereby incorporated by reference. By using a doorbell stride value, doorbell memory addresses may be shifted so that each doorbell is located in a different page of storage device 120 of FIG. 1 or FPGA 315 of FIG. 3 (in this context, the term “different page” is intended to describe a spacing so that the corresponding doorbell addresses in the address space of host device 105 of FIG. 1 are located in different pages based on the size of pages in the O/S memory). Then, since the doorbells are located in different pages in storage device 120 of FIG. 1 or FPGA 315 of FIG. 3, the corresponding addresses in the address space of host device 105 of FIG. 1 are also located in different pages of the O/S memory. Thus, for example, memory address 810-1 may map to memory address 820-1, memory address 810-2 may map to memory address 820-2, memory address 810-3 may map to memory address 820-3, and so on. Since pages 825-1, 825-2, 825-3, and 825-4 may represent different pages of memory, after this mapping each doorbell may reside in a different page of memory 815. As a result, VMs do not share access to a single page of memory 130 of FIG. 1 to access the doorbells, supporting VM isolation.
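The effect of a doorbell stride may be illustrated with a small calculation. The sketch below assumes a 4096-byte O/S page, 4-byte doorbells, and a base address of 0x1000. None of these values comes from the description, and the function names are hypothetical.

```python
PAGE_SIZE = 4096  # assumed O/S page size in bytes


def doorbell_addresses(base, count, stride):
    """Addresses of `count` doorbells spaced `stride` bytes apart."""
    return [base + i * stride for i in range(count)]


def pages(addresses, page_size=PAGE_SIZE):
    """Page number containing each address."""
    return [address // page_size for address in addresses]


# Conventional layout: 4-byte doorbells packed together in one page.
contiguous = doorbell_addresses(0x1000, 3, 4)

# Strided layout: a stride of one page puts each doorbell in its own page.
strided = doorbell_addresses(0x1000, 3, PAGE_SIZE)
```

In the contiguous layout all three doorbells fall in the same page, whereas with a stride of at least one page each doorbell lands in a distinct page, which is the property needed for VM isolation.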

In FIGS. 5-7, the description centers on storage device 120 of FIG. 1 offering extended I/O queue creation command 535 of FIG. 5, and storage device 120 of FIG. 1 enabling VM isolation. While storage device 120 of FIG. 1 might include the necessary hardware to support VM isolation, not every storage device necessarily includes this hardware. But by including FPGA 315 of FIG. 3 in the system, embodiments of the inventive concept may support VM isolation and SR-IOV-type access to storage device 120 of FIG. 1 even when storage device 120 of FIG. 1 does not include the hardware necessary for SR-IOV. Further, including FPGA 315 of FIG. 3 in the system may enhance the functionality of storage device 120 of FIG. 1 even when storage device 120 of FIG. 1 supports SR-IOV natively.

First, FPGA 315 may include the necessary hardware to support VM isolation. Referring back to FIG. 3, since every request that would be sent to storage device 120 of FIG. 1 may pass through FPGA 315 of FIG. 3, FPGA 315 may intercept requests intended to be received by storage device 120 of FIG. 1. For example, if storage device 120 of FIG. 1 does not support extended I/O queue creation command 535 of FIG. 5, FPGA 315 of FIG. 3 may intercept any such requests and handle I/O queue creation itself. FPGA 315 of FIG. 3 may send a conventional I/O queue creation command to storage device 120 of FIG. 1 while managing VM isolation itself, providing QoS guarantees, and sharing namespaces where appropriate. To that end, FPGA 315 of FIG. 3 may include hardware similar to what might otherwise be included in storage device 120 of FIG. 1. FPGA 315 of FIG. 3 may determine whether a particular I/O request in an I/O queue is within range of LBAs 710 of FIG. 7 associated with that I/O queue (or part of a shared namespace). FPGA 315 of FIG. 3 may organize and order I/O requests to be sent to storage device 120 of FIG. 1 so that QoS guarantees are provided.

To support I/O queue creation and VM isolation, FPGA 315 of FIG. 3 may include virtual I/O queue creation command 903, as shown in FIG. 9. Virtual I/O queue creation command 903 is very similar to I/O queue creation command 535 of FIG. 6, and includes similar parameters and attributes. The primary difference is that whereas I/O queue creation command 535 of FIG. 6 is intended to be processed by storage device 120 of FIG. 1 (although, as mentioned above, FPGA 315 of FIG. 3 may intercept I/O queue creation command 535 of FIG. 6 and process it internally instead), virtual I/O queue creation command 903 is directed toward FPGA 315 of FIG. 3, and is not intended to be processed by storage device 120 of FIG. 1.

Because virtual I/O queue creation command 903 is intended to achieve a result similar to I/O queue creation command 535, virtual I/O queue creation command 903 may include similar attributes/parameters to those of I/O queue creation command 535 of FIG. 6. Thus, virtual I/O queue creation command 903 may include parameters that are part of the conventional I/O queue creation command, such as queue size 906 (how much space should be allocated for the queue being established), queue identifier 909 (an identifier for the submission queue being established), completion queue identifier 912 (an identifier for the completion queue), queue priority 915 (the relative priority of the submission queue being established), and physically contiguous flag 918 (indicating whether the submission queue and the completion queue are to be physically contiguous or not). Virtual I/O queue creation command 903 may also include extended attributes, such as LBA range attribute 921, QoS attribute 924, and shared namespace attribute 927.

As with I/O queue creation command 535 of FIG. 6, LBA range attribute 921 may be expressed in a number of ways. For example, LBA range attribute 921 may include LBA start 930 and LBA end 933, which provide the starting and ending addresses for the LBAs in range of LBAs 710-1, 710-2, and 710-3 of FIG. 7. Alternatively, given LBA start 930 and queue size 906, it may be possible to infer LBA end 933, which would permit LBA end 933 to be omitted. The structure of extended virtual I/O queue creation command 903 may be established to support whichever variation on LBA range attribute 921 is desired, including and/or omitting parameters as appropriate.

Similarly, QoS attribute 924 may represent any desired QoS provisions. For example, the queue being established by virtual I/O queue creation command 903 may be associated with a VM that has an SLA, which attempts to guarantee a particular level of service for the user of the VM. QoS attribute 924 may be expressed using any desired form. Some example forms shown in FIG. 9 include minimum guaranteed bandwidth 936, maximum guaranteed bandwidth 939, minimum number of read requests per second 942, maximum number of read requests per second 945, minimum number of bytes read per second 948, maximum number of bytes read per second 951, minimum number of write requests per second 954, maximum number of write requests per second 957, minimum number of bytes written per second 960, and maximum number of bytes written per second 963. The structure of extended virtual I/O queue creation command 903 may be established to support whichever variation on QoS attribute 924 is desired, including and/or omitting parameters as appropriate.

Finally, shared namespace attribute 927 may represent a list of namespaces that are intended to share access to the physical storage for the VM requesting the creation of the I/O queue. The concept of the shared namespace attribute represents the exception to the concept of VM isolation: virtual machines may not be isolated if they are sharing access to a common dataset. But as there may be situations where multiple VMs need to share information and it is more efficient for them to share access to a common dataset than to message data between the VMs, shared namespace attribute 927 offers this workaround. Shared namespace attribute 927 may be implemented in any desired manner: one example is for virtual I/O queue creation command 903 to include shared namespace ID array 966, which lists the namespace IDs of the VMs that are to have shared access to the dataset.

Note that in some embodiments of the inventive concept, storage device 120 of FIG. 1 may include I/O queue creation command 535 of FIG. 6, and therefore may support I/O queues being created and assigned to VMs 305-1, 305-2, and 305-3. In such embodiments of the inventive concept, FPGA 315 of FIG. 3 may simply take virtual I/O queue creation command 903 and send it on to storage device 120 of FIG. 1 as I/O queue creation command 535 of FIG. 5, rather than processing virtual I/O queue creation command 903 to create a virtual I/O queue in FPGA 315 of FIG. 3.

But FPGA 315 of FIG. 3 may do more than just offload the hardware that supports SR-IOV-type operations (permitting FPGA 315 of FIG. 3 to be used with storage devices that do not include the SR-IOV-type hardware themselves). FPGA 315 of FIG. 3 may also extend the number of I/O queues “offered by” storage device 120 of FIG. 1. Since each VM on host device 105 of FIG. 1 requires its own access to storage device 120 of FIG. 1, each VM on host device 105 of FIG. 1 requires access to its own VF and I/O queue on storage device 120 of FIG. 1. But the number of VFs and I/O queues supported by storage device 120 of FIG. 1 may represent an upper bound on the number of VMs that might access storage device 120 of FIG. 1 at any point in time. This upper bound is likely far lower than the number of VMs hypervisor 310 of FIG. 3 (and processor 110 of FIG. 1) may support, which means that either VMs would be left without access to storage device 120 of FIG. 1 or resources of host device 105 of FIG. 1 would be left unused.

FPGA 315 of FIG. 3 may solve this problem by providing additional virtual I/O queues, which may number far higher than the number of I/O queues (and PFs/VFs) offered directly by storage device 120 of FIG. 1. FIG. 10 illustrates this situation.

In FIG. 10, assume that I/O queues 520-1, 520-2, and 520-3 represent the only I/O queues offered by storage device 120 of FIG. 1 (in practice the number of I/O queues is larger than three, but still generally smaller than the number of VMs that host device 105 of FIG. 1 may support simultaneously). If host device 105 of FIG. 1 is running more than three VMs, then one or more VMs would be left without an available I/O queue if storage device 120 of FIG. 1 were the only source of I/O queues. But with FPGA 315 supporting virtual I/O queues, these additional VMs may still have access to storage device 120 of FIG. 1.

When hypervisor 310 of FIG. 3 issues virtual I/O queue creation command 903 offered by FPGA 315, FPGA 315 may establish a new virtual I/O queue. FIG. 10 shows five virtual I/O queues 1005-1, 1005-2, 1005-3, 1005-4, and 1005-5, but FPGA 315 may support any number of virtual I/O queues (the limit being bounded by the number of gates in FPGA 315). Each virtual I/O queue includes its own submission and completion queues. Thus, virtual I/O queues 1005-1, 1005-2, 1005-3, 1005-4, and 1005-5 include submission queues 1010-1, 1010-2, 1010-3, 1010-4, and 1010-5, respectively, along with completion queues 1015-1, 1015-2, 1015-3, 1015-4, and 1015-5, respectively.

Each virtual I/O queue may then be associated with a (hardware) I/O queue of storage device 120 of FIG. 1. FPGA 315 may use mapping logic 1020, which may use any desired approach for organizing virtual I/O queues 1005-1 through 1005-5 into groups and mapping them to (hardware) I/O queues 520-1, 520-2, and 520-3. For example, in FIG. 10 mapping logic 1020 may select virtual I/O queues 1005-1 and 1005-2 to form group 1025-1, which is associated with I/O queue 520-1, virtual I/O queue 1005-3 (by itself) to form group 1025-2, which is associated with I/O queue 520-2, and virtual I/O queues 1005-4 and 1005-5 to form group 1025-3, which is associated with I/O queue 520-3. Thus, any I/O requests FPGA 315 receives from VMs 305-1, 305-2, and 305-3 of FIG. 3 may be “placed” in the appropriate virtual submission queue, then passed to the correct (hardware) submission queue of storage device 120 of FIG. 1. Similarly, responses received from a (hardware) completion queue may be “placed” in the correct virtual completion queue and “returned” to the appropriate VM.

Mapping logic 1020 may use any desired approach for organizing virtual I/O queues 1005-1 through 1005-5 into groups, with each group associated with a particular (hardware) I/O queue. For example, mapping logic 1020 might assign virtual I/O queues to groups randomly. Or, mapping logic 1020 might assign virtual I/O queues to groups in a round-robin manner: the first virtual I/O queue assigned to the first group, the second virtual I/O queue to the second group, and so on until all groups have one virtual I/O queue, after which virtual I/O queues are assigned to groups starting again with the first group. Or, mapping logic 1020 might assign virtual I/O queues to groups based on the expected I/O loads of VMs 305-1, 305-2, and 305-3 of FIG. 3, to attempt to balance the I/O loads across the groups. Or, mapping logic 1020 might assign virtual I/O queues to groups based on the relative priorities specified for VMs 305-1, 305-2, and 305-3 of FIG. 3 (note that queue priority 612 of FIG. 6 is part of the conventional NVMe specification-defined I/O queue creation command, and queue priority 915 of FIG. 9 is part of virtual I/O queue creation command 903 of FIG. 9). Or, mapping logic 1020 might assign virtual I/O queues to groups based on QoS attribute 924 of FIG. 9, in an attempt to satisfy the QoS requirements of VMs 305-1, 305-2, and 305-3. Embodiments of the inventive concept may also employ other methodologies in determining which virtual I/O queues to assign to which groups.
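The round-robin policy described above may be sketched as follows. The function name and the use of string labels for the queues are assumptions made for illustration. Note that this policy yields a different grouping from the one drawn in FIG. 10, which is consistent with mapping logic 1020 being free to use any desired approach.

```python
def round_robin_groups(virtual_queues, hardware_queues):
    """Assign virtual I/O queues to hardware I/O queues in round-robin order."""
    groups = {hw: [] for hw in hardware_queues}
    for i, vq in enumerate(virtual_queues):
        # Wrap around to the first hardware queue once every group has one entry.
        groups[hardware_queues[i % len(hardware_queues)]].append(vq)
    return groups


virtual = ["1005-1", "1005-2", "1005-3", "1005-4", "1005-5"]
hardware = ["520-1", "520-2", "520-3"]
groups = round_robin_groups(virtual, hardware)
```

With five virtual I/O queues and three hardware I/O queues, the fourth and fifth virtual I/O queues wrap around to the first and second groups, respectively.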

Since FPGA 315 knows which LBAs are associated with each virtual I/O queue (via LBA range attribute 921 of FIG. 9 and/or shared namespace attribute 927 of FIG. 9), FPGA 315 may enforce VM isolation by rejecting I/O requests that are not appropriate to that virtual I/O queue. Similarly, because FPGA 315 knows the QoS requirements of the VM (via QoS attribute 924 of FIG. 9), FPGA 315 may forward I/O requests to I/O queues 520-1, 520-2, and 520-3 in a manner that satisfies each VM's QoS requirements. So, for example, if a VM has established a QoS requirement of at least 10 I/O requests per second (assuming that many I/O requests are pending), FPGA 315 may prioritize I/O requests from the corresponding virtual I/O queue before forwarding I/O requests from other virtual I/O queues. (This example is somewhat arbitrary, since it implies no other VM has QoS requirements: where multiple VMs have QoS requirements, FPGA 315 may manage I/O requests in a manner that satisfies all VM QoS requirements.) Other QoS requirements, such as bandwidth requirements, may be similarly handled via FPGA 315.
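The isolation check may be sketched as a simple range test. The sketch below is a hypothetical software model, not a description of the FPGA hardware: the class, the function, and the idea that each request carries a namespace ID, a starting LBA, and a block count are assumptions. Range checks within a shared namespace are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class VirtualQueue:
    """Hypothetical record of what the FPGA knows about a virtual I/O queue."""
    namespace_id: int
    lba_start: int   # from the LBA range attribute
    lba_end: int
    shared_namespace_ids: frozenset = frozenset()  # from the shared namespace attribute


def is_request_allowed(queue, namespace_id, lba, num_blocks):
    """Return True if the request stays within what the queue may access."""
    if num_blocks <= 0:
        return False
    if namespace_id in queue.shared_namespace_ids:
        return True  # shared access; per-namespace range checks omitted here
    if namespace_id != queue.namespace_id:
        return False
    last_lba = lba + num_blocks - 1
    return queue.lba_start <= lba and last_lba <= queue.lba_end


queue = VirtualQueue(namespace_id=2, lba_start=0, lba_end=134_217_727,
                     shared_namespace_ids=frozenset({3}))
```

A request that runs even one block past the end of the range, or that targets a namespace that is neither the queue's own nor shared with it, would be rejected under this model.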

FIG. 11 shows a flowchart of an example procedure for storage device 120 of FIG. 1 to allocate I/O queue 520 of FIG. 5 for VM 305 of FIG. 3, according to an embodiment of the inventive concept. In FIG. 11, at block 1105, storage device 120 of FIG. 1 may receive I/O queue creation command 535 of FIG. 5 from hypervisor 310 of FIG. 3. Alternatively, at block 1110, storage device 120 of FIG. 1 may receive I/O queue creation command 535 of FIG. 5 from FPGA 315 of FIG. 3. In either case, I/O queue creation command 535 of FIG. 5 may include range of LBAs 710 of FIG. 7. Regardless of the source of I/O queue creation command 535 of FIG. 5, at block 1115, storage device 120 of FIG. 1 may establish I/O queue 520 of FIG. 5. At block 1120, storage device 120 of FIG. 1 may select range of PBAs 715 of FIG. 7 large enough to support range of LBAs 710 of FIG. 7 as received. At block 1125, storage device 120 of FIG. 1 may map range of LBAs 710 of FIG. 7 to range of PBAs 715 of FIG. 7. That way, when storage device 120 of FIG. 1 receives an I/O request, only range of PBAs 715 of FIG. 7 that corresponds to VM 305 of FIG. 3 may be accessed, isolating each VM from the others. In a similar way, since the mapping between range of LBAs 710 of FIG. 7 and range of PBAs 715 of FIG. 7 is reversible, given a particular physical address, the appropriate context for the I/O request may be determined, enabling the appropriate completion queue to be selected to notify VM 305 of FIG. 3 that the I/O request is complete, as suggested in block 1130 of FIG. 11 (wherein the success of the I/O request may be returned to VM 305 of FIG. 3).

Note that blocks 1105 and 1110 suggest that the extended I/O queue creation command 535 of FIG. 5 is used, regardless of the source of I/O queue creation command 535 of FIG. 5. While embodiments of the inventive concept include such possibilities, other embodiments of the inventive concept may include storage device 120 of FIG. 1 that does not support extended I/O queue creation command 535 of FIG. 5. In such embodiments of the inventive concept, FPGA 315 of FIG. 3 may support all of the functionality that simulates SR-IOV with storage device 120 of FIG. 1, and storage device 120 of FIG. 1 processes I/O requests without any context for the I/O requests. Put another way, FPGA 315 of FIG. 3 may support all of the context management and VM isolation, leaving storage device 120 of FIG. 1 to operate as a conventional storage device without implementing any of the functionality of the inventive concept.

FIG. 12 shows a flowchart of an example procedure for FPGA 315 of FIG. 3 to allocate virtual I/O queue 1005 of FIG. 10 for VM 305 of FIG. 3, according to an embodiment of the inventive concept. In FIG. 12, at block 1205, FPGA 315 of FIG. 3 may receive virtual I/O queue creation command 903 of FIG. 9 from hypervisor 310 of FIG. 3. At block 1210, FPGA 315 of FIG. 3 may establish virtual I/O queue 1005 of FIG. 10 for VM 305 of FIG. 3. At block 1215, FPGA 315 of FIG. 3 may send I/O queue creation command 535 of FIG. 5 to storage device 120 of FIG. 1. Note that I/O queue creation command 535 of FIG. 5 sent to storage device 120 of FIG. 1 might be the extended version of the command if storage device 120 of FIG. 1 supports it, or a conventional I/O queue creation command if not. At block 1220, FPGA 315 of FIG. 3 may receive a result of I/O queue creation command 535 of FIG. 5 from storage device 120 of FIG. 1.

At block 1225, mapping logic 1020 of FIG. 10 may map virtual I/O queue 1005 of FIG. 10 to I/O queue 520 of FIG. 5 as established by storage device 120 of FIG. 1. At block 1230, FPGA 315 of FIG. 3 may associate range of LBAs 710 of FIG. 7 with virtual I/O queue 1005 of FIG. 10. Finally, at block 1235, FPGA 315 of FIG. 3 may return a success indicator to hypervisor 310 of FIG. 3.

A few notes about FIG. 12 are in order. First, note that while in blocks 1105 and 1110 of FIG. 11 storage device 120 of FIG. 1 might not receive extended I/O queue creation command 535 of FIG. 5 (in embodiments of the inventive concept where FPGA 315 of FIG. 3 implements all of the functionality of the inventive concept and storage device 120 of FIG. 1 is a conventional storage device), the same is not true of FPGA 315 of FIG. 3. Any command sent from hypervisor 310 of FIG. 3 to storage device 120 of FIG. 1, or any I/O request sent from VM 305 of FIG. 3 to storage device 120 of FIG. 1, passes through FPGA 315 of FIG. 3. Since such commands would include I/O queue creation commands (either actual or virtual), FPGA 315 of FIG. 3 may be expected to receive an extended I/O queue creation command. If storage device 120 of FIG. 1 implements I/O queue creation command 535 of FIG. 5, then FPGA 315 of FIG. 3 would not necessarily need to receive virtual I/O queue creation command 903 of FIG. 9 at all (although embodiments of the inventive concept may include hypervisor 310 of FIG. 3 sending virtual I/O queue creation command 903 of FIG. 9 to FPGA 315 of FIG. 3, leaving it to FPGA 315 of FIG. 3 to either execute the command itself or issue I/O queue creation command 535 of FIG. 5 to storage device 120 of FIG. 1). And if FPGA 315 of FIG. 3 performs queue-to-VM assignment and context management, then FPGA 315 of FIG. 3 would want to receive virtual I/O queue creation command 903 of FIG. 9.

Second, note that blocks 1215 and 1220 assume that I/O queue 520 of FIG. 5 has not yet been established on storage device 120 of FIG. 1. If during execution I/O queue 520 of FIG. 5 has already been established on storage device 120 of FIG. 1 (for example, if FPGA 315 of FIG. 3 is mapping multiple virtual I/O queues 1005 of FIG. 10 to an individual I/O queue 520 of FIG. 5 on storage device 120 of FIG. 1), then blocks 1215 and 1220 may be omitted, as shown by dashed line 1240.

Third, FPGA 315 of FIG. 3 may not need to store context information for virtual I/O queue 1005 of FIG. 10. For example, if storage device 120 of FIG. 1 maps range of LBAs 710 of FIG. 7 to range of PBAs 715 of FIG. 7 and stores context information about VM 305 of FIG. 3 for I/O queue 520 of FIG. 5, then block 1230 may be omitted, as shown by dashed line 1245.
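The flow of FIG. 12, including the two optional paths just described, might be sketched in Python as follows. This is a minimal sketch under stated assumptions: the class names, the `store_context` flag, and the stand-in storage device are hypothetical and serve only to show where blocks 1215-1220 and block 1230 may be skipped.

```python
class StubStorageDevice:
    """Hypothetical stand-in that only records which I/O queues were created."""

    def __init__(self):
        self.created = []

    def create_io_queue(self, queue_id):
        self.created.append(queue_id)
        return True  # result returned to the FPGA (block 1220)


class FpgaModel:
    """Hypothetical model of the FPGA's virtual I/O queue creation handling."""

    def __init__(self, device, store_context=True):
        self.device = device
        self.store_context = store_context
        self.virtual_to_device = {}  # virtual queue id -> device queue id
        self.lba_ranges = {}         # virtual queue id -> (start LBA, count)

    def create_virtual_io_queue(self, vq_id, device_queue_id, lba_start, lba_count):
        # Blocks 1205-1210: receive the command and establish the virtual queue
        # (represented here by the entry added to virtual_to_device below).
        if vq_id in self.virtual_to_device:
            return False
        # Blocks 1215-1220: only needed if the device queue does not yet exist
        # (dashed line 1240 skips them when several virtual queues share one).
        if device_queue_id not in self.device.created:
            if not self.device.create_io_queue(device_queue_id):
                return False
        # Block 1225: map the virtual queue to the device queue.
        self.virtual_to_device[vq_id] = device_queue_id
        # Block 1230: optional if the device itself keeps the context
        # (dashed line 1245).
        if self.store_context:
            self.lba_ranges[vq_id] = (lba_start, lba_count)
        # Block 1235: success indicator for the hypervisor.
        return True
```

Creating two virtual queues against the same device queue in this sketch issues only one creation command to the device, which mirrors the case in which blocks 1215 and 1220 are omitted.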

FIG. 13 shows a flowchart of an example procedure for hypervisor 310 of FIG. 3 to process administrative requests from VMs 305-1, 305-2, and 305-3 of FIG. 3, according to an embodiment of the inventive concept. In FIG. 13, at block 1305, hypervisor 310 of FIG. 3 may receive an administrative request from VM 305 of FIG. 3. At block 1310, hypervisor 310 of FIG. 3 may trap the request. Then, at block 1315, hypervisor 310 of FIG. 3 may send a different (if similar) request to FPGA 315 of FIG. 3: this second request may simulate the original request. At block 1320, hypervisor 310 of FIG. 3 may receive a result from FPGA 315 of FIG. 3. Note that FPGA 315 of FIG. 3 might have processed the second request internally, or FPGA 315 of FIG. 3 might have forwarded its own request to storage device 120 of FIG. 1: hypervisor 310 of FIG. 3 is not concerned with how FPGA 315 of FIG. 3 processes the second request. Finally, at block 1325, hypervisor 310 of FIG. 3 may return the result to VM 305 of FIG. 3.
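The trap-and-forward behavior of FIG. 13 might be sketched in Python as follows. The dictionary-based request format, the `virtual_` prefix used to derive the second request, and the stub FPGA are assumptions made purely for illustration.

```python
class StubFpga:
    """Hypothetical FPGA stand-in; how it services the request is opaque."""

    def __init__(self):
        self.seen = []

    def handle(self, request):
        # Might be processed internally or forwarded to the storage device;
        # the hypervisor does not need to know which.
        self.seen.append(request)
        return {"status": "ok", "op": request["op"]}


class HypervisorModel:
    """Hypothetical model of the hypervisor's administrative request path."""

    def __init__(self, fpga):
        self.fpga = fpga

    def handle_admin_request(self, vm_id, request):
        # Blocks 1305-1310: receive the VM's request and trap it so that it
        # never reaches the storage device directly.
        # Block 1315: send a different (if similar) request that simulates it.
        second_request = {
            "op": "virtual_" + request["op"],
            "vm_id": vm_id,
            "args": dict(request.get("args", {})),
        }
        # Block 1320: receive the result from the FPGA.
        result = self.fpga.handle(second_request)
        # Block 1325: return the result to the VM.
        return result
```

For example, a VM's I/O queue creation request may reach the FPGA in this sketch as a virtual I/O queue creation request carrying the identity of the requesting VM.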

FIG. 14 shows a flowchart of an example procedure for storage device 120 of FIG. 1 or FPGA 315 of FIG. 3 to map memory addresses 810 of FIG. 8 of doorbells to different operating system pages 825 of FIG. 8 to support VM isolation, according to an embodiment of the inventive concept. Note that the example procedure is the same, regardless of whether the example procedure is implemented by storage device 120 of FIG. 1 or FPGA 315 of FIG. 3. For descriptive purposes, FPGA 315 of FIG. 3 will be described as performing the example process, but embodiments of the inventive concept extend to storage device 120 of FIG. 1 also performing the example process.

In FIG. 14, at block 1405, FPGA 315 of FIG. 3 may identify doorbells used to manage communication between FPGA 315 of FIG. 3 and VM 305 of FIG. 3. At block 1410, FPGA 315 of FIG. 3 may distribute memory addresses 820 of FIG. 8 across different memory pages, based on the page size of memory 130 of FIG. 1 in host device 105 of FIG. 1. For example, FPGA 315 of FIG. 3 might use a doorbell stride value to locate the doorbell memory addresses 820 of FIG. 8 in different pages. At block 1415, FPGA 315 of FIG. 3 may request an address space from host device 105 of FIG. 1. At block 1420, FPGA 315 of FIG. 3 may map memory addresses 820 of FIG. 8 to memory addresses in the requested address space. In this manner, VM isolation may be maintained, since no two VMs 305 of FIG. 3 may access their doorbells on a common memory page. At block 1425, FPGA 315 of FIG. 3 may provide VM 305 of FIG. 3 with a new memory address 820 of FIG. 8.

At block 1430, FPGA 315 of FIG. 3 may receive a request from VM 305 of FIG. 3 to access a doorbell at the mapped memory address. At block 1435, FPGA 315 of FIG. 3 may reverse the mapping, recovering original memory address 820 of FIG. 8. At block 1440, FPGA 315 of FIG. 3 may send the request to original memory address 820 of FIG. 8.
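The doorbell handling of FIG. 14 might be sketched in Python as follows. The 4096-byte page size, the use of a stride equal to one page, and all function and class names are assumptions for this sketch; an actual implementation may use any page size and any stride that places each doorbell in its own page.

```python
PAGE_SIZE = 4096  # assumed host page size in bytes


def distribute_doorbells(num_doorbells, base, stride=PAGE_SIZE):
    """Block 1410: place each doorbell at its own stride-separated address."""
    return [base + i * stride for i in range(num_doorbells)]


def same_page(addr_a, addr_b, page_size=PAGE_SIZE):
    """Return True if two addresses fall within the same memory page."""
    return addr_a // page_size == addr_b // page_size


class DoorbellMapper:
    """Hypothetical two-way map between doorbell addresses and host addresses."""

    def __init__(self, doorbell_addrs, host_base, page_size=PAGE_SIZE):
        # Blocks 1415-1420: host_base stands for the address space obtained
        # from the host; each doorbell is mapped to a distinct page within it.
        self.forward = {
            addr: host_base + i * page_size
            for i, addr in enumerate(doorbell_addrs)
        }
        self.reverse = {host: addr for addr, host in self.forward.items()}

    def host_address(self, doorbell_addr):
        # Block 1425: the address handed to the VM.
        return self.forward[doorbell_addr]

    def original_address(self, host_addr):
        # Blocks 1430-1440: reverse the mapping before forwarding the request.
        return self.reverse[host_addr]
```

Because each mapped address in this sketch lies in a different page, a page-granular protection mechanism on the host may grant each VM access to its own doorbell page only.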

In FIGS. 11-14, some embodiments of the inventive concept are shown. But a person skilled in the art will recognize that other embodiments of the inventive concept are also possible, by changing the order of the blocks, by omitting blocks, or by including links not shown in the drawings. All such variations of the flowcharts are considered to be embodiments of the inventive concept, whether expressly described or not.

Embodiments of the inventive concept offer several technical advantages over the prior art. First, by removing the requirement of hardware in storage device 120 of FIG. 1 to support VM isolation and using FPGA 315 of FIG. 3, any storage device may theoretically be used in a system requiring SR-IOV-type functionality, since FPGA 315 of FIG. 3 may enforce VM isolation. Second, as FPGA 315 of FIG. 3 may be programmed at the time of installation, the specific desired functionality to be offered by the system may be established at the time of installation, rather than being fixed at the point of manufacture of storage device 120 of FIG. 1 (which might not provide an optimal solution for all installations). Third, FPGA 315 of FIG. 3 may offer more VFs than storage device 120 of FIG. 1 might offer, enabling the use of storage device 120 of FIG. 1 in systems with larger numbers of VMs. Fourth, although VM isolation is provided by FPGA 315 of FIG. 3, each VM may still have “bare-metal” access to storage device 120 of FIG. 1 for I/O requests. Fifth, doorbell memory addresses may be remapped to different O/S memory pages, further enhancing VM isolation. And sixth, because context for I/O requests may be traced back to a particular I/O queue even after being removed from that I/O queue, storage device 120 of FIG. 1 may still support the QoS requirements of VMs, avoiding the blender effect.

The following discussion is intended to provide a brief, general description of a suitable machine or machines in which certain aspects of the inventive concept may be implemented. The machine or machines may be controlled, at least in part, by input from conventional input devices, such as keyboards, mice, etc., as well as by directives received from another machine, interaction with a virtual reality (VR) environment, biometric feedback, or other input signal. As used herein, the term “machine” is intended to broadly encompass a single machine, a virtual machine, or a system of communicatively coupled machines, virtual machines, or devices operating together. Exemplary machines include computing devices such as personal computers, workstations, servers, portable computers, handheld devices, telephones, tablets, etc., as well as transportation devices, such as private or public transportation, e.g., automobiles, trains, cabs, etc.

The machine or machines may include embedded controllers, such as programmable or non-programmable logic devices or arrays, Application Specific Integrated Circuits (ASICs), embedded computers, smart cards, and the like. The machine or machines may utilize one or more connections to one or more remote machines, such as through a network interface, modem, or other communicative coupling. Machines may be interconnected by way of a physical and/or logical network, such as an intranet, the Internet, local area networks, wide area networks, etc. One skilled in the art will appreciate that network communication may utilize various wired and/or wireless short range or long range carriers and protocols, including radio frequency (RF), satellite, microwave, Institute of Electrical and Electronics Engineers (IEEE) 802.11, Bluetooth®, optical, infrared, cable, laser, etc.

Embodiments of the present inventive concept may be described by reference to or in conjunction with associated data including functions, procedures, data structures, application programs, etc., which, when accessed by a machine, result in the machine performing tasks or defining abstract data types or low-level hardware contexts. Associated data may be stored in, for example, the volatile and/or non-volatile memory, e.g., RAM, ROM, etc., or in other storage devices and their associated storage media, including hard-drives, floppy-disks, optical storage, tapes, flash memory, memory sticks, digital video disks, biological storage, etc. Associated data may be delivered over transmission environments, including the physical and/or logical network, in the form of packets, serial data, parallel data, propagated signals, etc., and may be used in a compressed or encrypted format. Associated data may be used in a distributed environment, and stored locally and/or remotely for machine access.

Embodiments of the inventive concept may include a tangible, non-transitory machine-readable medium comprising instructions executable by one or more processors, the instructions comprising instructions to perform the elements of the inventive concepts as described herein.

The various operations of methods described above may be performed by any suitable means capable of performing the operations, such as various hardware and/or software component(s), circuits, and/or module(s). The software may comprise an ordered listing of executable instructions for implementing logical functions, and may be embodied in any “processor-readable medium” for use by or in connection with an instruction execution system, apparatus, or device, such as a single or multiple-core processor or processor-containing system.

The blocks or steps of a method or algorithm and functions described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a tangible, non-transitory computer-readable medium. A software module may reside in Random Access Memory (RAM), flash memory, Read Only Memory (ROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), registers, hard disk, a removable disk, a CD ROM, or any other form of storage medium known in the art.

Having described and illustrated the principles of the inventive concept with reference to illustrated embodiments, it will be recognized that the illustrated embodiments may be modified in arrangement and detail without departing from such principles, and may be combined in any desired manner. And, although the foregoing discussion has focused on particular embodiments, other configurations are contemplated. In particular, even though expressions such as “according to an embodiment of the inventive concept” or the like are used herein, these phrases are meant to generally reference embodiment possibilities, and are not intended to limit the inventive concept to particular embodiment configurations. As used herein, these terms may reference the same or different embodiments that are combinable into other embodiments.

The foregoing illustrative embodiments are not to be construed as limiting the inventive concept thereof. Although a few embodiments have been described, those skilled in the art will readily appreciate that many modifications are possible to those embodiments without materially departing from the novel teachings and advantages of the present disclosure. Accordingly, all such modifications are intended to be included within the scope of this inventive concept as defined in the claims.

Embodiments of the inventive concept may extend to the following statements, without limitation:

Statement 1. An embodiment of the inventive concept includes a storage device, comprising:

storage for data; and

at least one Input/Output (I/O) queue for requests from at least one virtual machine (VM) on a host device,

wherein the storage device supports an I/O queue creation command to request allocation of an I/O queue of the at least one I/O queue for a VM of the at least one VM, the I/O queue creation command including an LBA range attribute for a range of Logical Block Addresses (LBAs) to be associated with the I/O queue, and

wherein the storage device maps the range of LBAs to a range of Physical Block Addresses (PBAs) in the storage.

Statement 2. An embodiment of the inventive concept includes a storage device according to statement 1, wherein:

the storage device includes a Solid State Drive (SSD) storage device; and

the SSD storage device uses a Non-Volatile Memory Express (NVMe) interface to the host device.

Statement 3. An embodiment of the inventive concept includes a storage device according to statement 2, wherein the I/O queue includes a submission queue and a completion queue.

Statement 4. An embodiment of the inventive concept includes a storage device according to statement 2, wherein the storage device receives the I/O queue creation command from a hypervisor on the host device.

Statement 5. An embodiment of the inventive concept includes a storage device according to statement 2, wherein the LBA range attribute includes a starting LBA and an ending LBA.

Statement 6. An embodiment of the inventive concept includes a storage device according to statement 2, wherein:

the LBA range attribute includes a starting LBA; and

the I/O queue creation command further includes a queue size.

Statement 7. An embodiment of the inventive concept includes a storage device according to statement 2, wherein the I/O queue creation command includes a Quality of Service (QoS) attribute for Quality of Service parameters for the VM.

Statement 8. An embodiment of the inventive concept includes a storage device according to statement 7, wherein the QoS attribute is drawn from a set including a minimum bandwidth, a maximum bandwidth, a minimum number of read requests per second, a maximum number of read requests per second, a minimum number of bytes read per second, a maximum number of bytes read per second, a minimum number of write requests per second, a maximum number of write requests per second, a minimum number of bytes written per second, and a maximum number of bytes written per second.

Statement 9. An embodiment of the inventive concept includes a storage device according to statement 2, wherein the I/O queue creation command includes a shared namespace attribute specifying an array of namespaces to share access to the range of LBAs.

Statement 10. An embodiment of the inventive concept includes a storage device according to statement 2, further comprising a doorbell distribution logic to locate a first doorbell for the I/O queue in a different page of memory than a second doorbell for a second I/O queue.

Statement 11. An embodiment of the inventive concept includes a storage device according to statement 2, further comprising a Field Programmable Gate Array (FPGA) that maps a plurality of virtual I/O queues to the I/O queue, wherein:

the FPGA supports a virtual I/O queue creation command to request allocation of a first virtual I/O queue of the plurality of virtual I/O queues, the virtual I/O queue creation command including a second LBA range attribute for a second range of Logical Block Addresses (LBAs) to be associated with the first virtual I/O queue, the first virtual I/O queue associated with the I/O queue.

Statement 12. An embodiment of the inventive concept includes a storage device according to statement 11, wherein the FPGA includes a mapping logic to map the first virtual I/O queue to the I/O queue on the storage device, so that I/O requests received from the VM in the first virtual I/O queue are delivered to the storage device via the I/O queue and results received from the storage device in the I/O queue are delivered to the VM via the first virtual I/O queue.

Statement 13. An embodiment of the inventive concept includes a storage device according to statement 11, wherein a second virtual I/O queue of the FPGA is associated with the I/O queue.

Statement 14. An embodiment of the inventive concept includes a storage device according to statement 11, wherein the FPGA is operative to invoke the I/O queue creation command on the storage device.

Statement 15. An embodiment of the inventive concept includes a storage device according to statement 11, wherein the FPGA is operative to receive the virtual I/O queue creation command from the hypervisor on the host device.

Statement 16. An embodiment of the inventive concept includes a storage device according to statement 15, wherein the FPGA is operative to receive the virtual I/O queue creation command from the hypervisor for the VM on the host device.

Statement 17. An embodiment of the inventive concept includes a storage device according to statement 11, wherein the second LBA range attribute includes a second starting LBA and a second ending LBA.

Statement 18. An embodiment of the inventive concept includes a storage device according to statement 11, wherein:

the second LBA range attribute includes a second starting LBA; and

the virtual I/O queue creation command further includes a second queue size.

Statement 19. An embodiment of the inventive concept includes a storage device according to statement 11, wherein the virtual I/O queue creation command includes a second QoS attribute for Quality of Service parameters for the VM.

Statement 20. An embodiment of the inventive concept includes a storage device according to statement 19, wherein the second QoS attribute is drawn from a set including a second minimum bandwidth, a second maximum bandwidth, a second minimum number of read requests per second, a second maximum number of read requests per second, a second minimum number of bytes read per second, a second maximum number of bytes read per second, a second minimum number of write requests per second, a second maximum number of write requests per second, a second minimum number of bytes written per second, and a second maximum number of bytes written per second.

Statement 21. An embodiment of the inventive concept includes a storage device according to statement 11, wherein the virtual I/O queue creation command includes a second shared namespace attribute specifying a second array of namespaces to share access to the range of LBAs.

Statement 22. An embodiment of the inventive concept includes a storage device according to statement 11, wherein the FPGA further includes a doorbell distribution logic to locate a first virtual doorbell for the first virtual I/O queue in a different page of memory than a second doorbell for a second virtual I/O queue.

Statement 23. An embodiment of the inventive concept includes a Field Programmable Gate Array (FPGA), comprising:

at least one virtual Input/Output (I/O) queue for requests from at least one virtual machine (VM) on a host device; and

a mapping logic to map a virtual I/O queue of the at least one virtual I/O queue to an I/O queue on a storage device, so that I/O requests received from the VM in the virtual I/O queue are delivered to the storage device via the I/O queue and results received from the storage device in the I/O queue are delivered to the VM via the virtual I/O queue,

wherein the FPGA supports a virtual I/O queue creation command to request allocation of the virtual I/O queue of the at least one virtual I/O queue for a VM of the at least one VM, the virtual I/O queue creation command including an LBA range attribute for a range of Logical Block Addresses (LBAs) to be associated with the virtual I/O queue, and

wherein a storage device, separate from but connected to the FPGA, maps the range of LBAs to a range of Physical Block Addresses (PBAs) in the storage device.

Statement 24. An embodiment of the inventive concept includes an FPGA according to statement 23, wherein the virtual I/O queue includes a submission queue and a completion queue.

Statement 25. An embodiment of the inventive concept includes an FPGA according to statement 23, wherein the FPGA receives the virtual I/O queue creation command from a hypervisor for the VM on the host device.

Statement 26. An embodiment of the inventive concept includes an FPGA according to statement 23, wherein the LBA range attribute includes a starting LBA and an ending LBA.

Statement 27. An embodiment of the inventive concept includes an FPGA according to statement 23, wherein:

the LBA range attribute includes a starting LBA; and

the virtual I/O queue creation command further includes a queue size.

Statement 28. An embodiment of the inventive concept includes an FPGA according to statement 23, wherein the virtual I/O queue creation command includes a Quality of Service (QoS) attribute for Quality of Service parameters for the VM.

Statement 29. An embodiment of the inventive concept includes an FPGA according to statement 28, wherein the QoS attribute is drawn from a set including a minimum bandwidth, a maximum bandwidth, a minimum number of read requests per second, a maximum number of read requests per second, a minimum number of bytes read per second, a maximum number of bytes read per second, a minimum number of write requests per second, a maximum number of write requests per second, a minimum number of bytes written per second, and a maximum number of bytes written per second.

Statement 30. An embodiment of the inventive concept includes an FPGA according to statement 23, wherein the virtual I/O queue creation command includes a shared namespace attribute specifying an array of namespaces to share access to the range of LBAs.

Statement 31. An embodiment of the inventive concept includes an FPGA according to statement 23, wherein the FPGA maps a plurality of virtual I/O queues to the I/O queue on the storage device.

Statement 32. An embodiment of the inventive concept includes an FPGA according to statement 31, wherein the FPGA is operative to invoke an I/O queue creation command on the storage device to create the I/O queue.

Statement 33. An embodiment of the inventive concept includes an FPGA according to statement 32, wherein the I/O queue creation command includes a second LBA attribute for a second range of LBAs to be associated with the I/O queue.

Statement 34. An embodiment of the inventive concept includes an FPGA according to statement 23, further comprising a doorbell distribution logic to locate a first virtual doorbell for the virtual I/O queue in a different page of memory than a second doorbell for a second virtual I/O queue.

Statement 35. An embodiment of the inventive concept includes a method, comprising:

receiving an I/O queue creation command at a storage device for a first VM on a host device, the I/O queue creation command including at least an LBA range attribute for a range of Logical Block Addresses (LBAs) to be associated with an I/O queue on the storage device;

establishing the I/O queue on the storage device;

selecting a range of Physical Block Addresses (PBAs) on the storage device at least as large as the range of the LBAs;

mapping the range of LBAs to the range of PBAs; and

returning a success indicator,

wherein a second VM on the host device is thereby denied access to the range of PBAs.

Statement 36. An embodiment of the inventive concept includes a method according to statement 35, wherein:

the storage device includes a Solid State Drive (SSD) storage device; and

the SSD storage device uses a Non-Volatile Memory Express (NVMe) interface to the host device.

Statement 37. An embodiment of the inventive concept includes a method according to statement 36, wherein the I/O queue includes a submission queue and a completion queue.

Statement 38. An embodiment of the inventive concept includes a method according to statement 36, wherein receiving an I/O queue creation command for a first VM on a host device includes receiving the I/O queue creation command from a hypervisor for the first VM on the host device.

Statement 39. An embodiment of the inventive concept includes a method according to statement 36, wherein receiving an I/O queue creation command for a first VM on a host device includes receiving the I/O queue creation command from a Field Programmable Gate Array (FPGA) for the first VM on the host device.

Statement 40. An embodiment of the inventive concept includes a method according to statement 39, wherein the FPGA is included in the storage device.

Statement 41. An embodiment of the inventive concept includes a method according to statement 39, wherein the FPGA is separate from but connected to the storage device.

Statement 42. An embodiment of the inventive concept includes a method according to statement 36, wherein the LBA range attribute includes a starting LBA and an ending LBA.

Statement 43. An embodiment of the inventive concept includes a method according to statement 36, wherein:

the LBA range attribute includes a starting LBA; and

the I/O queue creation command further includes a queue size.

Statement 44. An embodiment of the inventive concept includes a method according to statement 36, wherein the I/O queue creation command includes a Quality of Service (QoS) attribute for Quality of Service parameters for the first VM.

Statement 45. An embodiment of the inventive concept includes a method according to statement 44, wherein the QoS attribute is drawn from a set including a minimum bandwidth, a maximum bandwidth, a minimum number of read requests per second, a maximum number of read requests per second, a minimum number of bytes read per second, a maximum number of bytes read per second, a minimum number of write requests per second, a maximum number of write requests per second, a minimum number of bytes written per second, and a maximum number of bytes written per second.

Statement 46. An embodiment of the inventive concept includes a method according to statement 36, wherein the I/O queue creation command includes a shared namespace attribute specifying an array of namespaces to share access to the range of LBAs.

Statement 47. An embodiment of the inventive concept includes a method according to statement 36, further comprising:

identifying a plurality of doorbells associated with a plurality of I/O queues;

assigning each of the plurality of doorbells to a memory address in a first plurality of memory addresses in the storage device, the first plurality of memory addresses distributed across a plurality of memory pages;

requesting an address space from the host device;

mapping the first plurality of memory addresses to a second plurality of memory addresses in the address space, the second plurality of memory addresses distributed across a plurality of memory pages; and

providing at least a doorbell address of the second plurality of memory addresses to the first VM.

Statement 48. An embodiment of the inventive concept includes a method according to statement 47, further comprising:

receiving a doorbell request to access the doorbell address;

mapping the doorbell address back to an address in the first plurality of memory addresses; and

sending the doorbell request to the address in the first plurality of memory addresses.

Statement 49. An embodiment of the inventive concept includes a method, comprising:

receiving a virtual I/O queue creation command at a Field Programmable Gate Array (FPGA) from a hypervisor for a first VM on a host device, the virtual I/O queue creation command including at least an LBA range attribute for a range of Logical Block Addresses (LBAs) to be associated with an I/O queue on the storage device;

establishing a first virtual I/O queue on the FPGA;

sending an I/O queue creation command to a storage device to establish an I/O queue on the storage device;

receiving a result from the storage device;

mapping the first virtual I/O queue to the I/O queue;

associating the range of LBAs with the first virtual I/O queue; and

returning a success indicator to the hypervisor,

wherein the range of LBAs maps to a range of Physical Block Addresses (PBAs) on the storage device, and

wherein a second VM on the host device is thereby denied access to the range of PBAs.

Statement 50. An embodiment of the inventive concept includes a method according to statement 49, wherein the first virtual I/O queue includes a submission queue and a completion queue.

Statement 51. An embodiment of the inventive concept includes a method according to statement 49, wherein the FPGA maps both the first virtual I/O queue and a second virtual I/O queue to the I/O queue.

Statement 52. An embodiment of the inventive concept includes a method according to statement 49, wherein the I/O queue creation command includes the LBA range attribute for the range of LBAs.

Statement 53. An embodiment of the inventive concept includes a method according to statement 49, wherein the LBA range attribute includes a starting LBA and an ending LBA.

Statement 54. An embodiment of the inventive concept includes a method according to statement 49, wherein:

the LBA range attribute includes a starting LBA; and

the virtual I/O queue creation command further includes a queue size.

Statement 55. An embodiment of the inventive concept includes a method according to statement 49, wherein the virtual I/O queue creation command includes a Quality of Service (QoS) attribute for Quality of Service parameters for the first VM.

Statement 56. An embodiment of the inventive concept includes a method according to statement 55, wherein the QoS attribute is drawn from a set including a minimum bandwidth, a maximum bandwidth, a minimum number of read requests per second, a maximum number of read requests per second, a minimum number of bytes read per second, a maximum number of bytes read per second, a minimum number of write requests per second, a maximum number of write requests per second, a minimum number of bytes written per second, and a maximum number of bytes written per second.

Statement 57. An embodiment of the inventive concept includes a method according to statement 49, wherein the virtual I/O queue creation command includes a shared namespace attribute specifying an array of namespaces to share access to the range of LBAs.

Statement 58. An embodiment of the inventive concept includes a methodaccording to statement 49, further comprising:

identifying a plurality of doorbells associated with a plurality of I/Oqueues;

assigning each of the plurality of doorbells to a memory address in afirst plurality of memory addresses in the FPGA, the first plurality ofmemory addresses distributed across a plurality of memory pages;

requesting an address space from the host device;

mapping the first plurality of memory addresses to a second plurality ofmemory addresses in the address space, the second plurality of memoryaddresses distributed across a plurality of memory pages; and

providing at least a doorbell address of the second plurality of memory addresses to the first VM.

Statement 59. An embodiment of the inventive concept includes a method according to statement 58, further comprising:

receiving a doorbell request to access the doorbell address;

mapping the doorbell address back to an address in the first plurality of memory addresses; and

sending the doorbell request to the address in the first plurality of memory addresses.
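
As a non-limiting illustration of statements 58 and 59, the doorbell mapping may be sketched as a pair of address tables. The sketch assumes a 4 KiB page size and places exactly one doorbell on each page, which is only one way of distributing doorbells across a plurality of pages; the names `DoorbellMapper`, `device_base`, and `host_base` are hypothetical.

```python
PAGE_SIZE = 4096  # assumed page size; real systems may differ


class DoorbellMapper:
    """Hypothetical mapper between device-side and host-side doorbell addresses."""

    def __init__(self, device_base: int, host_base: int, num_queues: int):
        # First plurality of addresses: one doorbell per page on the device side.
        self.device_addrs = [device_base + i * PAGE_SIZE for i in range(num_queues)]
        # Second plurality of addresses: one doorbell per page in the address
        # space obtained from the host.
        self.host_addrs = [host_base + i * PAGE_SIZE for i in range(num_queues)]
        # Reverse lookup used when a doorbell request arrives.
        self._to_device = dict(zip(self.host_addrs, self.device_addrs))

    def doorbell_for_vm(self, queue_index: int) -> int:
        """Return the host-side doorbell address to hand to a VM."""
        return self.host_addrs[queue_index]

    def forward(self, host_addr: int, value: int):
        """Map a doorbell request back to the device-side address.

        Raises KeyError if host_addr is not a known doorbell address.
        """
        return self._to_device[host_addr], value
```

Because each doorbell occupies its own page in this sketch, a page-granular protection mechanism could expose one doorbell to a VM without exposing its neighbors; that use is an inference from the page distribution and is stated here as an assumption.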

Statement 60. An embodiment of the inventive concept includes a method, comprising:

receiving a first request from a virtual machine (VM) on a host device, the first request destined for a storage device;

trapping the first request from reaching the storage device;

sending a second request to a Field Programmable Gate Array (FPGA), the second request simulating the first request;

receiving a result of the second request from the FPGA; and

sending the result of the second request to the VM.

Statement 61. An embodiment of the inventive concept includes a method according to statement 60, wherein:

the first request includes a first PCI configuration space request to access a first PCI configuration space; and

the second request includes a second PCI configuration space request to access a second PCI configuration space.

Statement 62. An embodiment of the inventive concept includes a method according to statement 60, wherein:

the first request includes an I/O queue creation request for the storage device to create an I/O queue; and

the second request includes a virtual I/O queue creation request for the FPGA to create a virtual I/O queue.

Statement 63. An embodiment of the inventive concept includes a method according to statement 62, wherein the second request further includes a Logical Block Address (LBA) attribute for a range of LBAs to be associated with the virtual I/O queue.

Statement 64. An embodiment of the inventive concept includes a method according to statement 63, wherein the virtual I/O queue creation request includes a shared namespace attribute specifying an array of namespaces to share access to the range of LBAs.

Statement 65. An embodiment of the inventive concept includes a method according to statement 62, wherein the second request further includes a Quality of Service (QoS) attribute for Quality of Service parameters for the VM.

Statement 66. An embodiment of the inventive concept includes a method according to statement 65, wherein the QoS attribute is drawn from a set including a minimum bandwidth, a maximum bandwidth, a minimum number of read requests per second, a maximum number of read requests per second, a minimum number of bytes read per second, a maximum number of bytes read per second, a minimum number of write requests per second, a maximum number of write requests per second, a minimum number of bytes written per second, and a maximum number of bytes written per second.
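
The trap-and-forward flow of statements 60 through 66 may be illustrated, purely as a sketch, by the following code. All names (`Request`, `FakeFPGA`, `translate`, `trap_and_forward`) and the string request kinds are hypothetical, and the FPGA is replaced by an in-memory stand-in that simply records what it receives.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Request:
    kind: str
    params: dict = field(default_factory=dict)


class FakeFPGA:
    """In-memory stand-in for the FPGA; records every request it handles."""

    def __init__(self):
        self.received = []

    def handle(self, request: Request) -> dict:
        self.received.append(request)
        return {"status": "ok", "handled": request.kind}


def translate(first: Request) -> Request:
    """Build the second request that simulates the first request."""
    if first.kind == "create_io_queue":
        # An I/O queue creation request becomes a virtual I/O queue
        # creation request addressed to the FPGA.
        return Request("create_virtual_io_queue", dict(first.params))
    # Other requests (e.g. PCI configuration space accesses) are mirrored.
    return Request(first.kind, dict(first.params))


def trap_and_forward(first: Request, fpga: FakeFPGA,
                     extra: Optional[dict] = None) -> dict:
    """Trap the VM's request, send a second request to the FPGA, return its result."""
    second = translate(first)
    if extra:
        # Attributes added to the second request, such as an LBA range or QoS.
        second.params.update(extra)
    return fpga.handle(second)
```

In this sketch the first request never reaches a storage device object at all, which models the trapping step; the additional attributes are attached only to the second request, so the original request from the VM is left unmodified.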

Statement 67. An embodiment of the inventive concept includes an article, comprising a non-transitory storage medium, the non-transitory storage medium having stored thereon instructions that, when executed by a machine, result in:

receiving an I/O queue creation command at a storage device for a first VM on a host device, the I/O queue creation command including at least an LBA range attribute for a range of Logical Block Addresses (LBAs) to be associated with an I/O queue on the storage device;

establishing the I/O queue on the storage device;

selecting a range of Physical Block Addresses (PBAs) on the storage device at least as large as the range of the LBAs;

mapping the range of LBAs to the range of PBAs; and

returning a success indicator,

wherein a second VM on the host device is thereby denied access to the range of PBAs.
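
The steps of statement 67 may be sketched as follows, by way of example only. The sketch assumes a simple "next free block" allocator for selecting the PBA range and identifies VMs by string; neither detail comes from the embodiments, and the names `StorageDevice`, `create_io_queue`, and `translate` are hypothetical.

```python
class StorageDevice:
    """Hypothetical model of a device that binds an LBA range to each I/O queue."""

    def __init__(self, total_pbas: int):
        self.total_pbas = total_pbas
        self.next_free_pba = 0  # assumed bump allocator for PBA ranges
        self.queues = {}        # queue_id -> (vm_id, start_lba, end_lba, pba_base)

    def create_io_queue(self, vm_id: str, start_lba: int, end_lba: int):
        """Establish a queue and map its LBA range to a private PBA range.

        Returns the new queue identifier on success, or None on failure.
        """
        length = end_lba - start_lba + 1
        if length <= 0 or self.next_free_pba + length > self.total_pbas:
            return None
        queue_id = len(self.queues)
        # Select a PBA range at least as large as the LBA range and map it.
        self.queues[queue_id] = (vm_id, start_lba, end_lba, self.next_free_pba)
        self.next_free_pba += length
        return queue_id

    def translate(self, queue_id: int, vm_id: str, lba: int) -> int:
        """Translate an LBA to a PBA, refusing VMs that do not own the queue."""
        owner, start, end, base = self.queues[queue_id]
        if vm_id != owner:
            raise PermissionError("queue belongs to another VM")
        if not start <= lba <= end:
            raise ValueError("LBA outside the range bound to this queue")
        return base + (lba - start)
```

Two VMs may use the same LBA numbers in this sketch, since each queue's LBA range resolves to a disjoint PBA range; a second VM cannot reach the first VM's PBAs because it has no queue that maps to them.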

Statement 68. An embodiment of the inventive concept includes an article according to statement 67, wherein:

the storage device includes a Solid State Drive (SSD) storage device; and

the SSD storage device uses a Non-Volatile Memory Express (NVMe) interface to the host device.

Statement 69. An embodiment of the inventive concept includes an article according to statement 68, wherein the I/O queue includes a submission queue and a completion queue.

Statement 70. An embodiment of the inventive concept includes an article according to statement 68, wherein receiving an I/O queue creation command for a first VM on a host device includes receiving the I/O queue creation command from a hypervisor for the first VM on the host device.

Statement 71. An embodiment of the inventive concept includes an article according to statement 68, wherein receiving an I/O queue creation command for a first VM on a host device includes receiving the I/O queue creation command from a Field Programmable Gate Array (FPGA) for the first VM on the host device.

Statement 72. An embodiment of the inventive concept includes an article according to statement 71, wherein the FPGA is included in the storage device.

Statement 73. An embodiment of the inventive concept includes an article according to statement 71, wherein the FPGA is separate from but connected to the storage device.

Statement 74. An embodiment of the inventive concept includes an article according to statement 68, wherein the LBA range attribute includes a starting LBA and an ending LBA.

Statement 75. An embodiment of the inventive concept includes an article according to statement 68, wherein:

the LBA range attribute includes a starting LBA; and

the I/O queue creation command further includes a queue size.

Statement 76. An embodiment of the inventive concept includes an article according to statement 68, wherein the I/O queue creation command includes a Quality of Service (QoS) attribute for Quality of Service parameters for the first VM.

Statement 77. An embodiment of the inventive concept includes an article according to statement 76, wherein the QoS attribute is drawn from a set including a minimum bandwidth, a maximum bandwidth, a minimum number of read requests per second, a maximum number of read requests per second, a minimum number of bytes read per second, a maximum number of bytes read per second, a minimum number of write requests per second, a maximum number of write requests per second, a minimum number of bytes written per second, and a maximum number of bytes written per second.

Statement 78. An embodiment of the inventive concept includes an article according to statement 68, wherein the I/O queue creation command includes a shared namespace attribute specifying an array of namespaces to share access to the range of LBAs.

Statement 79. An embodiment of the inventive concept includes an article according to statement 68, further comprising:

identifying a plurality of doorbells associated with a plurality of I/O queues;

assigning each of the plurality of doorbells to a memory address in a first plurality of memory addresses in the storage device, the first plurality of memory addresses distributed across a plurality of memory pages;

requesting an address space from the host device;

mapping the first plurality of memory addresses to a second plurality of memory addresses in the address space, the second plurality of memory addresses distributed across a plurality of memory pages; and

providing at least a doorbell address of the second plurality of memory addresses to the first VM.

Statement 80. An embodiment of the inventive concept includes an article according to statement 79, further comprising:

receiving a doorbell request to access the doorbell address;

mapping the doorbell address back to an address in the first plurality of memory addresses; and

sending the doorbell request to the address in the first plurality of memory addresses.

Statement 81. An embodiment of the inventive concept includes an article, comprising a non-transitory storage medium, the non-transitory storage medium having stored thereon instructions that, when executed by a machine, result in:

receiving a virtual I/O queue creation command at a Field Programmable Gate Array (FPGA) from a hypervisor for a first VM on a host device, the virtual I/O queue creation command including at least an LBA range attribute for a range of Logical Block Addresses (LBAs) to be associated with an I/O queue on the storage device;

establishing a first virtual I/O queue on the FPGA;

sending an I/O queue creation command to a storage device to establish an I/O queue on the storage device;

receiving a result from the storage device;

mapping the first virtual I/O queue to the I/O queue;

associating the range of LBAs with the first virtual I/O queue; and

returning a success indicator to the hypervisor,

wherein the range of LBAs maps to a range of Physical Block Addresses (PBAs) on the storage device, and

wherein a second VM on the host device is thereby denied access to the range of PBAs.
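
As one hedged illustration of the sequence in statement 81, the FPGA-side handling of a virtual I/O queue creation command might look like the sketch below. The storage device is a stub that always succeeds, results are passed as plain dictionaries, and every name (`StubStorageDevice`, `FPGA`, `create_virtual_io_queue`) is hypothetical. Forwarding the LBA range to the storage device is optional in the embodiments and is done here only for concreteness.

```python
class StubStorageDevice:
    """Stand-in storage device whose queue creation always succeeds."""

    def __init__(self):
        self.created = 0

    def create_io_queue(self, start_lba: int, end_lba: int) -> dict:
        queue_id = self.created
        self.created += 1
        return {"ok": True, "queue_id": queue_id}


class FPGA:
    """Hypothetical FPGA model that maps virtual I/O queues to device I/O queues."""

    def __init__(self, storage):
        self.storage = storage
        self.virtual_queues = {}  # virtual queue id -> bookkeeping record
        self._next_id = 0

    def create_virtual_io_queue(self, vm_id: str, start_lba: int, end_lba: int) -> dict:
        # Establish the virtual I/O queue on the FPGA.
        vq_id = self._next_id
        self._next_id += 1
        vq = {"vm": vm_id, "io_queue": None, "lba_range": None}
        self.virtual_queues[vq_id] = vq

        # Send an I/O queue creation command to the storage device and
        # receive its result.
        result = self.storage.create_io_queue(start_lba, end_lba)
        if not result["ok"]:
            del self.virtual_queues[vq_id]
            return {"ok": False}

        # Map the virtual queue to the device queue and associate the LBA range.
        vq["io_queue"] = result["queue_id"]
        vq["lba_range"] = (start_lba, end_lba)

        # Return a success indicator to the caller (the hypervisor).
        return {"ok": True, "virtual_queue_id": vq_id}
```

The sketch creates one device queue per virtual queue; an implementation could instead map several virtual queues to a single device queue, in which case the `io_queue` field of several records would hold the same identifier.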

Statement 82. An embodiment of the inventive concept includes an article according to statement 81, wherein the first virtual I/O queue includes a submission queue and a completion queue.

Statement 83. An embodiment of the inventive concept includes an article according to statement 81, wherein the FPGA maps both the first virtual I/O queue and a second virtual I/O queue to the I/O queue.

Statement 84. An embodiment of the inventive concept includes an article according to statement 81, wherein the I/O queue creation command includes the LBA range attribute for the range of LBAs.

Statement 85. An embodiment of the inventive concept includes an article according to statement 81, wherein the LBA range attribute includes a starting LBA and an ending LBA.

Statement 86. An embodiment of the inventive concept includes an article according to statement 81, wherein:

the LBA range attribute includes a starting LBA; and

the virtual I/O queue creation command further includes a queue size.

Statement 87. An embodiment of the inventive concept includes an article according to statement 81, wherein the virtual I/O queue creation command includes a Quality of Service (QoS) attribute for Quality of Service parameters for the first VM.

Statement 88. An embodiment of the inventive concept includes an article according to statement 87, wherein the QoS attribute is drawn from a set including a minimum bandwidth, a maximum bandwidth, a minimum number of read requests per second, a maximum number of read requests per second, a minimum number of bytes read per second, a maximum number of bytes read per second, a minimum number of write requests per second, a maximum number of write requests per second, a minimum number of bytes written per second, and a maximum number of bytes written per second.

Statement 89. An embodiment of the inventive concept includes an article according to statement 81, wherein the virtual I/O queue creation command includes a shared namespace attribute specifying an array of namespaces to share access to the range of LBAs.

Statement 90. An embodiment of the inventive concept includes an article according to statement 81, further comprising:

identifying a plurality of doorbells associated with a plurality of I/O queues;

assigning each of the plurality of doorbells to a memory address in a first plurality of memory addresses in the FPGA, the first plurality of memory addresses distributed across a plurality of memory pages;

requesting an address space from the host device;

mapping the first plurality of memory addresses to a second plurality of memory addresses in the address space, the second plurality of memory addresses distributed across a plurality of memory pages; and

providing at least a doorbell address of the second plurality of memory addresses to the first VM.

Statement 91. An embodiment of the inventive concept includes an article according to statement 90, further comprising:

receiving a doorbell request to access the doorbell address;

mapping the doorbell address back to an address in the first plurality of memory addresses; and

sending the doorbell request to the address in the first plurality of memory addresses.

Statement 92. An embodiment of the inventive concept includes an article, comprising a non-transitory storage medium, the non-transitory storage medium having stored thereon instructions that, when executed by a machine, result in:

receiving a first request from a virtual machine (VM) on a host device, the first request destined for a storage device;

trapping the first request from reaching the storage device;

sending a second request to a Field Programmable Gate Array (FPGA), the second request simulating the first request;

receiving a result of the second request from the FPGA; and

sending the result of the second request to the VM.

Statement 93. An embodiment of the inventive concept includes an article according to statement 92, wherein:

the first request includes a first PCI configuration space request to access a first PCI configuration space; and

the second request includes a second PCI configuration space request to access a second PCI configuration space.

Statement 94. An embodiment of the inventive concept includes an article according to statement 92, wherein:

the first request includes an I/O queue creation request for the storage device to create an I/O queue; and

the second request includes a virtual I/O queue creation request for the FPGA to create a virtual I/O queue.

Statement 95. An embodiment of the inventive concept includes an article according to statement 94, wherein the second request further includes a Logical Block Address (LBA) attribute for a range of LBAs to be associated with the virtual I/O queue.

Statement 96. An embodiment of the inventive concept includes an article according to statement 95, wherein the virtual I/O queue creation request includes a shared namespace attribute specifying an array of namespaces to share access to the range of LBAs.

Statement 97. An embodiment of the inventive concept includes an article according to statement 94, wherein the second request further includes a Quality of Service (QoS) attribute for Quality of Service parameters for the VM.

Statement 98. An embodiment of the inventive concept includes an article according to statement 97, wherein the QoS attribute is drawn from a set including a minimum bandwidth, a maximum bandwidth, a minimum number of read requests per second, a maximum number of read requests per second, a minimum number of bytes read per second, a maximum number of bytes read per second, a minimum number of write requests per second, a maximum number of write requests per second, a minimum number of bytes written per second, and a maximum number of bytes written per second.

Consequently, in view of the wide variety of permutations to the embodiments described herein, this detailed description and accompanying material is intended to be illustrative only, and should not be taken as limiting the scope of the inventive concept. What is claimed as the inventive concept, therefore, is all such modifications as may come within the scope and spirit of the following claims and equivalents thereto.

What is claimed is:
1. A Field Programmable Gate Array (FPGA), comprising: at least one virtual Input/Output (I/O) queue for requests from at least one virtual machine (VM) on a host device; and a mapping logic to map a virtual I/O queue of the at least one virtual I/O queue to an I/O queue on a storage device, so that I/O requests received from the VM in the virtual I/O queue are delivered to the storage device via the I/O queue and results received from the storage device in the I/O queue are delivered to the VM via the virtual I/O queue, wherein the FPGA supports a virtual I/O queue creation command to request allocation of the virtual I/O queue of the at least one virtual I/O queue for a VM of the at least one VM, the virtual I/O queue creation command including an LBA range attribute for a range of Logical Block Addresses (LBAs) to be associated with the virtual I/O queue, and wherein a storage device, separate from but connected to the FPGA, maps the range of LBAs to a range of Physical Block Addresses (PBAs) in the storage device.
2. An FPGA according to claim 1, wherein the LBA range attribute includes a starting LBA and an ending LBA.
3. An FPGA according to claim 1, wherein the virtual I/O queue creation command includes a Quality of Service (QoS) attribute for Quality of Service parameters for the VM.
4. An FPGA according to claim 1, wherein the virtual I/O queue creation command includes a shared namespace attribute specifying an array of namespaces to share access to the range of LBAs.
5. An FPGA according to claim 1, wherein the FPGA maps a plurality of virtual I/O queues to the I/O queue on the storage device.
6. An FPGA according to claim 5, wherein the FPGA is operative to invoke an I/O queue creation command on the storage device to create the I/O queue.
7. An FPGA according to claim 1, further comprising a doorbell distribution logic to locate a first virtual doorbell for the virtual I/O queue in a different page of memory than a second doorbell for a second virtual I/O queue.